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UNSUPERVISED SPATIAL CLUSTERING WITH SPECTRAL DISCRIMINATION 


I. INTRODUCTION 


The objective of this research is the conversion of remotely sensed data into 
useful information regarding the location and distribution of various classes of identifiable 
earth observation features. The remotely sensed data are collected from a platform, which 
may be an airplane, a ship, a balloon, or a satellite. The sensors used generally collect 
electromagnetic radiation in specified wavelength intervals (multispectral and thermal 
scanners, other types of radiometers, and multiband photography), echo returns 
(side-looking aerial radar and sonar), and magnetic field information (magnetometer) The 
collected data may be analog, digital, or a photographic image, and the data are 
formatted such that a one-to-one correspondence is preserved between the ground scene 
and the data, as in an aerial photograph. The data are then analyzed to determine what 
features can be extracted from the data and to determine the location and distribution of 
these features. Examples of features could be crop types, diseased crops, bodies of water, 
water pollution, and land types. Multitudes of other types of features exist. As presently 
used, there are three main types of feature extraction methods, which will be denoted by 
human photointerpretation, supervised computer feature extraction, and unsupervised 
computer feature extraction. The first two feature extraction methods involve human 
supervision and judgment after the fact, whereas the third method does not permit any 
change of criterion from datum to datum The unsupervised computer feature extraction 
method is based entirely on a logical set of mathematical rules designed to extract the 
features presented in the remotely sensed data This report discusses the three most 
commonly used types of feature extraction methods and presents a recently developed 
computer program for unsupervised feature extraction. 


II. SYNOPSIS ON FEATURE EXTRACTION 


The most universal method of feature extraction is the classical art of 
photointerpretation This art is primarily a process of visual inspection and subjective 
analysis by a trained human observer, and it has been greatly enhanced by a variety of 
instruments and machines (stereo viewers, microdensitometers, autoplotters, false color 
image enhancement, etc). The human’s role in this task is to subjectively combine visually 
acquired inputs relating to spatial properties, color, texture, and temporal phenomena to 
identify and classify objects in the ground scene [1, 2, 3] In performing this task, the 
human exercises a sophisticated ability to combine various types of data and draw 



conclusions as to information subtleties present. As an example of this sophistication, 
consider the land-use classifications, “pasture” and “improved pasture ” Improved pasture 
is a pasture that is surrounded by a fence, but from an aerial photograph, fences are 
often difficult to observe Grazing animals, however, develop the habit of walking along 
the fences and the photointerpreter merely has to look for a brown path surrounding a 
green field. The most obvious and serious inadequacy of photointerpretation is the 
manpower and time needed to analyze large volumes of data resulting from remote 
sensing on a large-area scale 

Because of this inadequacy, more and more emphasis is being placed on 
developing computer techniques for analyzing large volumes of remotely sensed data. 
Present computer techniques are designed to exploit the spectral information of imagery 
or of spectral scanners for extracting information. One approach to this technique is to 
use a data bank. The use of a data bank assumes that an object or an area of sufficient 
dimensions in the ground scene to be considered homogeneous possesses a unique spectral 
signature. The spectral signatures are produced by recording the returned energy 
(combined reflected and radiated electromagnetic energy) in each of a number of narrow 
and discrete wavelength intervals, The data bank would contain prestored or empirically 
derived statistics on the signatures of known objects or it would contain the ability to 
model and analytically calculate what a given signature should be. 

The difficulty with this technique is that there is appreciable variability in the 
signatures, which is due to natural temporal changes as well as manmade changes. In 
practice, this variability is evident even for repeated observations in which great care is 
taken to maintain constant sun angle, viewing angle, cloud cover, ambient temperature 
and humidity, instrument calibration, etc. This variability can be attributed to a large 
degree to a lack of microscopic homogeneity between grossly identical objects or species 
in two different ground scenes. For example, no two leaves of diseased corn will appear 
exactly the same, and no two rows of com will be arrayed exactly the same or be 
surrounded by the same color of ground exposed between the stalks Thus, the problem 
of signature comparison leads to a modeling theory [4] 

Another serious drawback to this technique is the amount of computer memory 
required to store all possible signatures and their variations, and a prohibitive amount of 
computer time required to compare an unknown spectral signature with all possible 
signatures stored in the data bank, unless a table look-up procedure can be used [5] 

The next approach to be described is probably the most popular and widely used 
computer feature extraction technique. This approach has been developed extensively at 
Purdue University [6, 7, 8, 9, 10] , and many related techniques [11, 12, 13, 14] and 
improvements have been derived from this approach. This technique will be called the 
“supervised classification technique” since it requires human supervision for the selection 
of training areas, which ultimately determine the number and types of features to be 
extracted, as well as the accuracy of classification. 
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To .describe this technique, it will be necessary first to describe the data collection 
and formatting, as well as to develop the mathematical notation for operating on the 
data. A multispectral scanner is generally used for the data collection The scanner is a 
monochromator with a detector array for recording the reflected and emitted radiation 
from the ground in different wavelength intervals. The monochromator is mounted in an 
aircraft, and the rotating mirror scans the ground scene as the airplane flies, producing an 
analog signal as shown in Figure 1 If the detector array contains n detectors, then n 
simultaneous signals or channels of data are recorded for the same ground scene The 
analog data are then digitized and recorded on magnetic tape to produce an 
n-dimensional digital image of the ground scene Let the algebraic value of the digital 
number derived from the analog signal be denoted by j^Xy where k = 1,... ,n, the 

channel number or wavelength interval, i = 1, 2, 3, the scan line number 

or x coordinate of the ground scene, and j = 1,2, , m, the sample number for 

scan i or the y coordinate of the ground scene 

Thus, the data set collected contains n channels of data, £ scan lines, 

and m digital samples per scan per channel with an amplitude given by ^Xy Each 

ground-scene coordinate, i and j , of the digital image is normally called a resolution 
element, and each resolution element is therefore described by an n-dimensional vector 
called a feature vector, Xy The components of the feature vector are xy = [jXy , 2 x ij v n xy ] 

The next step is to produce a boundary map [14] so that it may be compared 
with an aerial photograph taken of the ground scene at the same time that the 

multispectral scanner data were collected The boundary map and aerial photograph are 
shown in Section IV, and a method for producing a boundary map will be discussed in 
Section III. The photograph allows an observer to select, for example, areas which appear 
to be crops and then to locate the scan line and column number of the data 
corresponding to those crops from the boundary map. The data may then be accessed for 
analysis and used as a training area [15] 

For identification of crop species, ground-truth information is required, unless a 
data bank is available. The ground-truth information could be collected by an observer 
visiting each field or obtaining the information from a farmer as to the crop type It is 
interesting to note that this type of information collection has occasionally been 
erroneous, and present classification techniques have been sufficiently accurate to point 
out these errors. With this information, the fields in the aerial photograph can be labeled 
according to crop type. Training areas are then selected from the boundary map 
corresponding to particular known fields in the aerial photograph. The desired statistics 
are calculated from the data in the training area, and a statistical decision rule is used for 
classifying the remaining data in the digital image. A computer map of the ground scene 
is then printed out showing the location of all resolution elements that were classified 


3 








according to the statistical decision rule as belonging or being similar to the particular 
training areas selected. The computer map can be compared with the ground-truth 
information in the aerial photograph for accuracy Accuracies of 80 to 90 percent are not 
uncommon. 

One of the most widely used decision rules is the maximum likelihood ratio 
technique. The development of this technique depends upon two types of decision errors 
that can be made in the classification. The first type of error is not assigning a feature 
vector as belonging to a class when it actually does. The second type of error is assigning 
a feature vector to a class when in actuality it does not belong. Weights are usually 
assigned to both types of errors depending on how costly the error is in making the 
wrong decision. These weights, Ljj , are referred to as cost factors and are associated with 

classifying a feature vector, x , belonging to class i into class j 

Ultimately, the decision procedure must be able to indicate a final choice for each 
point in the n-dimensional feature space The feature space, R , must be divided 
into m mutually exclusive regions, Rj , R 2 ,..., R m . 

Let the probability density function for samples from class i be fj(x) and the a 
priori probability of occurence of class i be Pj The probability of misclassifying a 
sample from class i into class j is 


P(j | i) = / fi(x)dx 


( 1 ) 


and the conditional expected cost if the sample is from class i is 


C: 


m m 

Z L ij p o|‘) - Z L u / f i« dx 

j=1 j=l R j 


( 2 ) 


One useful criterion that is often used is the average cost which is given by 


C 


m 

, 5 , 


P iCi 


m m 

Z / Z L ij p i¥ x ) dx 

j=i Rj i=i 


(3) 
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and the regions R t R m are chosen to minimize the average cost. The average cost is 
minimized if the integrand 


m 

Jj Mjpihty 

i<= 1 


is a minimum, which is equivalent to deciding that a sample x is from class j if 


m m 

2 LyttfCx) < 2 Lik P i f i« for all k*j (4) 

i = 1 i = 1 

By subtracting 

m 

.Ej L u p i f i (x) 

i + j, k 

from both sides, the decision rule is obtained, x is in class j if 


f /*> > (Lkj - L kk )P k 
f k M ( L jk ~ L jj) p j 


Since the cost factors Lj^ and the a priori probabilities Pj are constants, the decision 

rule that minimizes the average risk is the ratio of two conditional probability densities 
or likelihood ratios with a threshold value. The cost factors and a priori probabilities are 
used only in determining the threshold value. With a simple choice of cost 
factors Ly ■= 0 for correct classification and Ly = 1 with i = j for misclassification 

and equal a priori probabilities, the decision rule reduces to deciding that x is in 
class j if 


f/x) > fj^x) for all k£ j 


( 6 ) 


• This is generally referred to as the maximum likelihood decision rule, 
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In actual practice, the cost factors, a priori probabilities, and the probability 
distributions are not known, but must be estimated from members of each training area, 
The various classification techniques that have been developed are essentially different 
methods for estimating the probability distributions. 

In most cases, a multivariate Gaussian distribution is assumed for the training area 
data, and the vector means, Mj , and covariance matrices, Vj , are calculated from the 

data in each training area. Because the Gaussian distribution is exponential, the logarithm 
of the decision rule is used. Thus, decide that x belongs to class j for all k = j if 


l°g e | v j I + < l°g e |v k | + (*-M k ) T V k *‘(?-M k ), (7) 


where j Vj | and Vj 1 are the determinant and inverse of the covariance matrix Vj and 
T denotes the transpose operation of a matrix [16] 

The advantage of the supervised technique is that with available ground truth, 
known areas of particular interest can be used to locate similar areas elsewhere in the data. 
The disadvantage of the supervised technique is the manual selection of training areas 
from the data, which, when applied to large data volumes, can become as tedious and as 
time consuming as the art of photointerpretation. This is especially true of high-altitude 
multispectral aerial photography and satellite data. 

In recent years, the amount of remotely sensed data collected on earth 
observations has been phenomenal, as evidenced by publications in the literature [17] 
The future outlook indicates that this data collection will continue to grow, as shown by 
Earth Resources Technology Satellites 1 and 2, the Skylab Earth Resources Experiment 
Package, and the Space Shuttle Sortie Laboratory Because the volume of data expected 
appears to be getting out of hand, much emphasis is being placed on the development of 
unsupervised techniques for feature extraction These computer techniques are designed 
to extract features from remotely sensed data without either the assistance and 
supervision of an observer or the prior benefit of ground-truth information, One 
promising technique has been developed by Su [18] which uses a sequential variance 
analysis in combination with an iterative K-mean algorithm approach, and References 19 
and 20 give a recent review of the state of the art of other techniques. The usual 
disadvantages of the unsupervised techniques are that a high degree of classification 
accuracy is often difficult to obtain and the computer time is relatively extensive. Thus, 
the specific problem to attack is the development of an unsupervised feature extraction 
program that classifies with a high degree of accuracy The computer time required can 
then be more efficiently utilized by optimizing the computer program logic, 
programing in an efficient machine language, and by using special-purpose computers. The 
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advantage of the unsupervised technique is that no prior information is required before 
the data analysis. The results obtained with the unsupervised technique are designed to 
show the location and distribution of the extracted features, but no identification of the 
features is possible without ground-truth information. Thus, the results of the 
unsupervised technique can be used for directing ground-truth patrols and the location 
and amount of ground truth needed can be accurately determined before any ground 
truth is collected. The adequate determination of where and how much ground truth to 
collect is an important economic consideration when data are collected on a global or 
even a regional basis. If data are collected on a seasonal interval, for example, future 
ground-truth collection can be minimized by monitoring the feature signatures as a 
function of season. If some of these signatures change significantly from past results, the 
classification map can be used to direct ground-truth patrols to update only those feature 
identifications which have changed. 

To maintain current information on current accomplishments in remote sensing, 
the “Proceedings of the International Symposia on Remote Sensing of the Environment,” 
published by the Willow Run Laboratories, University of Michigan, and the “Annual 
Earth Resources Aircraft Program Status Review,” published by NASA, are highly 
recommended along with the literature survey listed in Reference 1 7 

Section III discusses a proposed unsupervised computer program for feature 
extraction 


III. UNSUPERVISED FEATURE EXTRACTION 


Feature extraction from remotely sensed data is a very complex process. For this 
reason, no comprehensive model or explicit external criteria exist for defining and 
extracting all features. The selection of training areas for feature extraction introduces 
the bias of human observation and preconceived judgment into the classification criteria. 
For example, an observer may select a training area containing corn to attempt to classify 
corn in all other portions of the ground scene, while the data from that training area may 
only be capable of classifying all sparsely growing, small green plants surrounded by bare 
soil. 


The philosophy behind the development of this classification program is to accept 
tentatively an internal criterion, i.e., the data itself should naturally suggest the features 
to be extracted and subsequently identified. 

A flow chart of successive stages of the computer program is shown in Figure 2. 

The first stage of the program is to produce a boundary map of the data by 
separating the data into homogeneous and inhomogeneous areas. This is accomp fished by 
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Figure 2. Program logic flow. 
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computing the average feature vector spectral distance per channel or dimension in the 
direction of the x and y ground-scene coordinates, The formulas for this calculation are 
given by 

Vi 

( 8 ) 


where i and j are the scan line and scan-line column number, respectively, of the data in 
the digital image and the summation is over n channels of data, 

These spectral distances are calculated for each resolution element and stored in a 
joint probability distribution, P(s x , Sy). The average feature vector distance per dimension 

is computed to determine a measure of change present in the data from one resolution 
element to the next and also to keep the numbers occurring in the calculation in the 
same range, regardless of the number of dimensions used. The areas of the data where the 
spectral change, in either the x or y ground-scene coordinate direction, is equal to or 
less than the average spectral change will be classified as a homogeneous resolution 
element, otherwise, the resolution element will be classified as a boundary The average 
distances of the average feature vector spectral distance per channel are computed using 
the formulas 


n 


n 


% 


n 


S (k x ij-k x i-u) 


k- 1 


and Sy = 




II s x P ( s x ,s y) ds x dSy 

= // s y P ( s x> s y) ds x ds y 
and 

s x s y “ II s x s y P( s x ,s y) ds x dSy 



( 9 ) 

( 10 ) 


( 11 ) 


These calculations are used for defining a decision boundary in the joint probability 
distribution as to what combination of x and y spectral distances is classified as 
belonging to a homogeneous resolution element or a boundary resolution element. The 
decision boundary is in the shape of an ellipse which may have principal axes in 
directions other than s x and Sy The direction and magnitude of the principal axes are 

calculated [21] from the eigenvectors and eigenvalues associated with the matrix 
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( 12 ) 


s„s 


xy 


s x s y 


The equation of the ellipse in the principal axes coordinate system is 


'2 




(13) 


where s x and Sy are the distances s x and Sy rotated into the principal axes coordinate 
system by the eigenvector transformation, s x 2 and Sy 2 are the eigenvalues or half 
lengths of the principal axes squared 


1 



0 


0 


1 



(14) 


is then rotated back into the s x and s y coordinate system by the inverse eigenvector 

transformation to obtain the equation of the ellipse on the decision boundary in 
the s x and s y coordinate system The equation of the ellipse is given by 


as x 2 + bSy 2 + cs x Sy 


( 15 ) 


where a, b, and c are determined from rotating equation (14). The decision is to classify 
a resolution element as being homogeneous if 


as x 2 + bSy 2 + cs x Sy < 


( 16 ) 
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A digital image of a boundary map is recorded on magnetic tape for use in the second 
stage of processing. The magnetic tape utilizes -1 to describe a boundary element and zero 
for a homogeneous element. 

The second stage is concerned with the selection and spatial merging of 
unknown candidate features based upon the homogeneity of the ground scene, as 
displayed by the boundary map recorded on the magnetic tape. Because the boundaries 
on the map are not closed and have open gaps in some cases, the problem is to select 
homogeneous areas with a mathematical logic that will prevent these areas from 
containing a mixture of different features. This is accomplished by using a fixed 
shape p x p resolution element array which moves through the boundary map only in 
the x or y ground-scene coordinate direction. Initially, a homogeneous area in the 
boundary map is found which is large enough for the array to fit into. The area covered 
by the array is designated as belonging to cluster 1 The array is allowed to move in this 
area until a boundary is encountered and then the direction of movement must be 
changed. All resolution elements falling within the movement of the fixed array are said 
to belong to cluster 1, and the zeros previously occurring on the boundary map in these 
locations are changed to +1 After the array can no longer move and engulf new 
resolution elements, another location is found which will contain the p x p array All 
resolution elements fitting into this array will be designated as belonging to cluster 2. The 
process is repeated until all of the boundary map data have been exhausted Clusters 
which physically touch on the boundary map are merged and called the same cluster 
This is called spatial merging, which will be contrasted with spectral merging later After 
spatial merging, the clusters are renumbered so that the cluster numbers will have a 
continuous range from 1 , — ,N. The output of this second stage is another map 
containing the numbers -1,0, 1, 2,.. ,N, with -1 indicating boundary resolution elements, 
0 indicating homogeneous resolution elements not encountered by the moving fixed 
shape array, and 1, 2, ..,N indicating resolution elements belonging to clusters 1, 2,...., N, 
respectively The fixed-shape array, if chosen large enough, will not permit the mixing of 
features because the open gaps in the boundaries will be so small compared to the array 
size that the array will not be able to pass through the boundary On multispectral 
scanner data taken at an altitude of 792.5 m (2600 ft), a 10 x 10 array is sufficient to 
keep different features separate, This also provides a minimum sample size of 100, which 
is a very adequate sample on which to base statistical calculations. Typically, most 
multispectral data are collected at higher altitudes so that a 10 x 10 array will, in 
general, be sufficient to prevent the mixing of features. The magnetic tape containing the 
boundaries and locations of clusters and the magnetic tape containing the raw data are 
the inputs to the third stage of processing. 

The third stage of processing is concerned with spectral merging of the selected 
unknown candidate features. The boundary and cluster map tape gives the locations of 
the raw data on the raw data tape belonging to each cluster The mean feature vectors 
and covariance matrices, c, are calculated for each cluster using the formulas 
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(17) 


k x £ 




k x i£ 


and 


i N e 

km c £ “ n 7 Z k x i£ m x i£ " (k x fi) (m x fi) 
K i = 1 


(18) 


where x is the algebraic value of the i^ 1 sample in the k or m** 1 channel of the £^ 
cluster containing Ng samples. The mean value of the samples in the k** 1 channel of the 
cluster is given by ^Xg , while the covariance between channels k and m of the £“* 
cluster is given by j^Cg These calculations are used to define decision boundaries with 

which to physically surround the data belonging to a cluster in n-dimensional space. The 
most general closed surface that can be used to surround the n-dimensional data is an 
n-dimensional hyperellipse. The centroid of the cluster ellipse is given by the feature 
vector mean values ^Xg , while the principal axes direction and magnitude can be 

determined by the eigenvector transformation as was discussed when equations (12), (13), 
and (14) were introduced. 


In a statistical sense, the eigenvector transformation on the covariance matrix 
locates the direction of orthogonal principal axes for which the variances are minimized 
and maximized [22] . The variances are the diagonal elements of the covariance matrix 
and the off-diagonal elements (covariances) are made zero by the transformation, Thus, 
the equation of an n-dimensional ellipse in reduced form is obtained for each cluster, 
and, in general, each cluster will have a different coordinate system. The next step is to 
derive a decision rule for determining how many clusters actually represent the same 
feature. This decision is now based entirely upon spectral information rather than the 
spatial information which was used in equations (12), (13), and (14). The decision rule is 
that two clusters represent the same feature if the centroids of both clusters are 
contained in both clusters’ ellipses. Although this spectral-merging procedure was derived 
from physical intuition, a theorem [23] was located which adds mathematical precision. 
The theorem states that the orthogonal transformation which minimizes the mean-square 

ft. 

distance between a set of vectors from the £ tn cluster, subject to the constraint that the 
volume of the space is invariant under transformation, is a rotation, Eg , followed by a 

diagonal transformation, Wg The rows of the matrix Eg are the eigenvectors of the 
covariance matrix, Cg , of the set of vectors, and the elements of Wg are those given in 
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equation (19), where j^Cgl^ is the standard deviation of the coefficients of the set of 
vectors in the direction of the k^ 1 eigenvector of eg 

The diagonal elements of the diagonal transformation, Wg , are given by 



The rationale behind this theorem [23] merits some discussion as applied to spectral 
merging. This discussion will also apply to spectral classification, which is to be described 
later 


It is desirable to have a measure of similarity between two clusters, S -1 , for 
deciding whether or not two clusters are to be merged, Let "v be the mean feature vector 
of one cluster, with components given by equation (17), and let j Xj^ j be the entire set 

of feature vectors contained in the other cluster with the subscript m denoting 
the mth vector of the set. The similarity may be regarded as a mean square spectral 
distance and should describe the closeness of v to the entire set of feature 
vectors, j x m j According to the philosophy of Reference 23, the definition of 

“distance” does not necessarily mean Euclidean distance, but may mean “closeness” in 
some arbitrary, abstract property of the set (x m ) which has yet to be determined The 

use of an undetermined distance measure does not alter the definition of similarity, but 
provides an ordering which similarity lacks. Mathematically the similarity S' 1 [v,(x m )l 
of a feature vector ~v and a set of feature vectors (x m ) exemplifying a cluster can be 
written as 



1 


M 


M Z 

m = 1 



( 20 ) 


where the distance measure, d 2 ( ), has not yet been specified The inverse of S is used 

because the smaller the distance, the more the similarity 

One possible way to choose the distance measure is to utilize the knowledge that 
the set (x m ) describes one feature, and, therefore, the members of the set should all be 

very similar or close. Furthermore, the members of the set could be made even more 
similar by using feature weighting coefficients, j^wg , and minimizing the average 

intercluster distance of all the members of the set. Thus, the equation to be minimized is 
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M g Mg n 

D £ 2 = Mp (Mg - 1) 2 E Z kkV (k x m e -k x p c ) = minimum, (21) 

pg — 1 ni£ “ 1 k — 1 ' 

where k represents the k^ 1 component of the feature vector and mg and pg represent 
the m** 1 and p** 1 feature vector of the set £ containing Mg members. 

To minimize equation (21), some constraint must be placed on the feature 
weighting coefficients. Several alternatives are possible, but the most appealing constraint 
is 


n 

k 2j kk w £ = 1 (22) 


If the feature weighting coefficients are considered to be the dimens ions of a hypercube, 
then equation (22) is a constant volume constraint. To minimize Dg 2 , equation (21) 

needs to be written in a more convenient form, 



2Mg 

(Mg-1) 


n 

E kk w eW 

k = 1 


where j^crg is the variance of the dimension of the feature vectors in set £ Using 
the constraint and an undetermined multiplier, the minimizing equation becomes 


kk dw £ 


( 


kk w £k a £ 



0 for j = l,2,. 


(24) 
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which may be rewritten as 


kk w G 


VT 

k a £ 


( 25 ) 


Using the constraint for determining X gives 


kk w £ 




(26) 


The feature weighting coefficients indicate that if the variance in a particular dimension is 
small, then the value of the feature can be anticipated with a high degree of accuracy and 
should be heavily weighed On the other hand, if the variance is large, little weight should 
be attached to that feature. 

Equation (26) is the diagonal transformation discussed in the theorem 
encompassing equation (19) and is identical to equation (19). As the theorem states, this 
distance measure can be additionally minimized by applying the eigenvector 
transformation. It follows by equations (18) and (21) that the similarity criterion 
[equation (20)] for deciding whether to merge two clusters k and £ is given by 



the Mg feature vectors belonging to cluster 2 

Substituting equation (19) for p p Wg 2 gives 
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(28) 




^p v k ~ p x £^ 

pp c £ 


+ n 


The decision for merging two clusters will now depend on the threshold value given to 
the measures of similarity for both clusters. A threshold value can be determined by 
calculating the average similarity within a cluster, which is given by 



Using equation (29) as a threshold value for equation (28) gives the decision rule, merge 
clusters k and £ if 


V (p Vk ~ P Xg ) < n 
p = 1 pp c £ 


Notice that equation (30) is the equation of a hyperellipse in the principal axes 
coordinates, which was earlier derived by physical intuition, Notice also that the 
threshold value n is independent of cluster and depends only on the dimension of the 
feature space. Thus, if an elliptical boundary decision rule is used in the principal axis 
coordinate system, the theorem can be extended to say that the diagonal transformation 
is not needed and only the eigenvector transformation is needed since the threshold can 
always be written as some constant times 




7 n 
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After the initial M set of clusters has been merged into a final N set of clusters, 
with N < M , the eigenvector rotation matrix and the equation for the hyperellipse in 
principal axes coordinates are stored in memory for each final cluster. The clusters are 
now called classes since each class represents a statistically different feature presented by 
the data. The regions in feature space Ri , R 2 ,...., R^ corresponding to each class are 

distinct since no more merging is possible. This feature extraction information is now 
ready for use in the last stage of processing. 

The final stage of processing is concerned with classifying the data in the digital 
image of the ground scene and showing the location and distribution of the features. The 
inputs to this stage of processing are the raw data tape, the statistics for each class, and 
the boundary tape. The decision rule for classifying a resolution element feature 
vector v" into class C is given by 

2 « „ , (3D 

p = 1 2 pp c £ 


which is the same as equation (30) except for the factor of 2. The factor of 2 is justified 
based on the following arguments, First, the decision rule for classifying a resolution 
element can be more lenient than the decision rule for merging two clusters. The errors 
of mismerging are more pronounced in the classification than the misclassification of 
individual resolution elements. Secondly, the expression on the left side of the inequality 
in equation (31) is identical to the argument in the exponential of a disjoint multivariate 
Gaussian distribution. Thus, a resolution element is not classified as belonging to a class if 
the n-dimensional exponential n folds. Further justification for using equations (30) and 
(31) and the threshold values of n and 2n, respectively, is that the terms contained in the 
summation are chi-square distributed, and a chi-square distribution has a mean value n 
and variance 2n, where n is the number of degrees of freedom [24], 

The data obtained from the raw data tape are classified according to the decision 
rule in equation (31), and the result of the classification is updated on the boundary tape 
map. For example, if a resolution element belongs to class 3, a 3 is placed on the 
boundary tape at the location of the resolution element, If a resolution element is not 
classified as belonging to any class, no change is made on the boundary tape. If a 
resolution element can be classified as belonging to several classes, the resolution element 
is placed in the class which makes the left side of equation (31) a minimum. The 
boundary tape is now called a classification tape and contains the numbers -1, 
representing an unclassified boundary resolution element, 0, representing an unclassified 
homogeneous resolution element, and 1, 2,...., N, representing resolution elements placed 
in classes 1, 2,...., N, respectively 
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If the initial size of the pxp array for cluster selection is too large, an 
incomplete classification of the ground scene will result. The computer program has the 
capability for using the classification map as a boundary map, treating the classified 
resolution elements as also being boundaries, decreasing the pxp array to a q x q array, 
and selecting additional clusters. All previous information obtained on the classes is 
updated, if appropriate, and the classification map is reclassified using the old unchanged 
information and the new updated information. This procedure can be repeated as many 
times as desired, but two classification passes are generally sufficient. 

Because of the large amounts of data involved, the output of the classification 
map is nominally put on microfilm rather than standard computer paper printout. This is 
evidenced by the fact that the classification map from a 70-mm aerial photograph, using 
standard computer printout, can easily cover a 37.16-m 2 (400-ft 2 ) wall 

The results are given in Section IV 


IV. RESULTS 


Classification maps were obtained from the analysis of two data sets. Purdue’s 
Flight Line Cl and the Yellowstone Park test sites. The advantage of working with these 
two sets of data is that they have been extensively analyzed by other investigators using 
different feature extraction techniques. The results of these investigations are mainly 
available in References 6, 11, and 1 2 for comparison 

Both data sets were acquired with the same multispectral scanner and contain 12 
channels of data. All 12 channels were used in the classification, and the wavelength 
intervals corresponding to each channel are listed in Table 1 The data sets contain 222 
12-dimensional feature vectors per scan, and 901 scans from each set were analyzed The 
data are also uncalibrated and, therefore, only contains relative numbers. The similarity 
of the two data sets drastically comes to an end with the above description 

Flight Line Cl is a very flat agricultural scene approximately 6.5 km long and 1.6 
km wide, and the resolution of the data is approximately 6 m The ground scene mostly 
contains rectangular patterns of crop acreage which appear homogeneous. 

Yellowstone Park, However, is a wilderness area approximately 16 km long and 
3.2 km wide with a resolution of approximately 15 m. The wilderness area contains very 
irregularly shaped patterns and is quite inhomogeneous in some locations. This data set 
also possesses a considerable dynamic range in the type of terrain and amount of features 
present. For example, there are mountains and canyons, and the vegetation coverage 
ranges from dense forest to scattered trees, meadows, and, finally, to bare rocks. The 
geologic information ranges from a sand and gravel base, abundantly sprinkled with elk 
droppings in the meadow areas, to rocks covered with moss and lichen, and, finally, to 
large boulders. 
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TABLE 1 CHANNEL AND WAVELENGTH CORRESPONDENCE 


Channel No. 

Wavelength Interval (pm) 

1 

0.4 - 0.44 

2 

0.44 - 0.46 

3 

0.46 - 0.48 

4 

0.48 - 0.50 

5 

0.50 - 0.52 

6 

0.52 - 0.55 

7 

0.55 - 0.58 

8 

0.58 - 0.62 

9 

0.62 - 0.66 

10 

0.66 - 0.72 

11 

0.72 - 0.80 

12 

0.80 - 1.0 


Cl Flight Line 

Figure 3 contains an aerial photograph of the ground scene on the left, a 
boundary map in the center with the locations of the initial clusters, and a classification 
map of the ground scene using statistics from these clusters. The aerial photograph also 
contains a list of symbols that identifies the contents of the ground scene The list of 
symbols and their identification are given in Table 2. Examination of the results obtained 
from the multispectral scanner data reveals a problem that does not occur in multiband 
photographic data The aircraft acquiring the scanner data was not quite able to fly a 
straight line and had occasional yaw problems. The scanner data are roll compensated, 
however, as is the case with most scanners. 

The computer maps shown in Figure 3 demonstrate the first two-out-of-three 
intermediate outputs that can be obtained from the computer program and are 
considerably scaled down to show that the pattern in the data has a definite resemblance 
to that of the aerial photograph. Hence, it is not possible or necessary at this stage to be 
able to read the symbols on the computer maps. 

The boundary map contains 79 different cluster locations after spatial merging. A 
10x10 array was used to select these clusters. The first cluster is at the top of the map 
just to the right of the road, while the second cluster is to the right and just below 
cluster 1 Cluster 3 is the top left cluster in a cornfield, while cluster 4 is the first cluster 
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TABLE 2. GROUND TRUTH INFORMATION 


Symbol 

Description 

A 

Alfalfa 

C 

Corn 

H 

Hay 

0 

Oats 

P 

Pasture 

R 

Rye 

S 

Soybeans 

T 

Timothy 

W 

Wheat 

B S. 

Bare Soil 

D.A. 

Diverted Acres 

R.C. 

Red Clover 


in the soy field below the wheatfield to the left on the road. Cluster 5 is the first cluster 
directly below clusters 1 and 2 and is also located in a soyfield Cluster 6 is in the same 
soyfield as cluster 4 and is just below cluster 4, while cluster 8 is just to the right of 
cluster 6, The other clusters follow using the same computer pattern logic. Only 43 single 
character output symbols are available to name the 79 clusters, and, therefore, the symbol 
usage is recycled starting with cluster 44. Tables 3 and 4 give the original cluster 
numbers, ground-truth information, a description of the spectral merging process, and the 
final class number. Remember that when several clusters are spectrally merged, the 
number given to the merged cluster is the smallest cluster number of the clusters involved 
in the merging, and some of the remaining clusters not involved in the merging may have 
their cluster numbers changed to keep the numbering of the clusters in a consecutive 
order. This is illustrated in the spectral multiple merging columns of Tables 3 and 4. If 
no ground truth is available for a cluster, the letter U is used to indicate that it is 
unidentified. Examination of Table 3 multiple merge number 1 reveals that the statistics 
of clusters 1 and 2 were merged together and the result was called cluster 1 Cluster 3 
did not merge with cluster 1 , but was renamed cluster 2 so that consecutive numbering 
would be preserved. Cluster 4 was renamed cluster 3 because it would not merge with 
clusters 1 and 2, and cluster 5 was renamed cluster 4 because it would not merge with 
clusters 1, 2, or 3 Cluster 6 would merge with cluster 3 and the statistics of cluster 3 
were updated to include the statistics of cluster 6. Finally, cluster 23 was able to merge 
with clusters 8, 9, 11, and 12, and the statistics of cluster 8 were updated to include the 
statistics of clusters 9, 11, 12, and 23. The renaming of the clusters resulting from this 
multiple merging is shown in multiple merge column number 2, and this column 
continues until another multiple merge is encountered. The classification map resulting 
from the merging and first classification pass is shown on the right side of Figure 3, This 
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TABLE 4. MERGING PROCEDURE FOR CLUSTERS 41 THROUGH 79 


Cluster No. 

Identification 

Spectral Multiple Merge No. 

Final 

4 

5 

6 

7 

Class No. 

41 

C 

14 

14 

14 

14 

14 

42 

U 

15 

14 

14 

14 

14 

43 

s, c 

3 

3 

3 

3 

3 

44 

u 

14, 15 

14 

14 

14 

14 

45 

0 


15 

15 

15 

15 

46 

w 


10 

10 

10 

10 

47 

s 


16 

16 

16 

16 

48 

u 


16 

16 

16 

16 

49 

DA-R 


17 

17 

17 

17 

50 

DA-R 


18 

17 

17 

17 

si 

P-0 


19 

18 

18 

18 

52 

DA-R 


17, 18 

17 

17 

17 

53 

DA-R 



17 

17 

17 

54 

DA-R 



17 

17 

17 

55 

0 



19 

19 

19 

56 

0 



19 

19 

19 

57 

C 



8 

8 

8 

58 

C 



20 

20 

20 

59 

C 



8 

8 

8 

60 

S 



21 

21 

21 

61 

s 



22 

21 

21 

62 

s 



21,22 

21 

21 

63 

s 




21 

21 

64 

s 




21 

21 

65 

c 




7 

7 

66 

s 




22 

22 

67 

s 




7 

7 

68 

s 




5 

5 

69 

s 




5 

5 

70 

s 




5 

5 

71 

s 




3 

3 

72 

s 




5 

5 

73 

s 




23 

23 

74 

u 




1 

1 

75 

s 




22 

22 

76 

s 




23 

23 

77 

s 




24 

24 

78 

u 




3 

3 

79 

u 




3 

3 


2 ‘ 












map contains 24 classes, or features, as indicated by this highest number in the last 
column of Table 4. Because the classification was incomplete, additional clusters were 
selected by a 6 x 6 array starting with cluster number 25, and using the classification map 
as input for the cluster selection rather than the boundary map. The merging procedure 
is listed in the last column of Table 5 since no multiple merges were encountered. The 
computer program only allows for 43 classes and, therefore, refused to accept any 
additional data after cluster 48, as indicated in Table 5. Table 6 lists the initial cluster 
numbers of all clusters that are located in the same field for both classification passes. 
This provides an additional check to determine whether the merging was conducted 
properly and to assist in locating the first classification pass clusters shown in Figure 3 if 
desired. The reason that one field may have several clusters is that boundary points 
within a field may not permit spatial merging of these clusters. Spectral merging is used 
to overcome this problem. 


TABLE 5 MERGING PROCEDURE FOR CLUSTERS 25 THROUGH 52 


Cluster No. 

Identification 

Final Class No 

25 

WATER 

25 

26 

C 

26 

27 

C 

26 

28 

DA-RC 

27 

29 

S 

28 

30 

s 

29 

31 

DA-RC 

30 

32 

DA-RC 

31 

33 

P 

32 

34 

DA-RC 

33 

35 

C 

34 

36 

DA-RC 

35 

37 

DA-RC 

30 

38 

P 

36 

39 

C 

37 

40 

C 

37 

41 

C 

38 

42 

H-RC 

30 

43 

H-RC 

30 

44 

C 

39 

45 

0 

40 

46 

0 

41 

47 

0 

42 

48 

U 

43 

49 

RC 


50 

RC 


51 

0 


52 

RC 
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TABLE 6. MULTIPLE CLUSTER FIELDS 


Classification Pass 1-10x10 Array 

Classification Pass 

2-6x6 Array 

Initial Cluster No. 

Identification 

Initial Cluster No. 

Identification 

1,2 

U 

26,27 

C 

4,6,8 

S 

31,32,34,36,37 

DA-RC 

5,7,9 

S 

39,40 

C 

12,13,14 

C 

42,43 

H-RC 

17,19 

s 

45,46,47,51 

O 

21,22,23,24 

c 



25,27 

s 



31,32 

0 



34,35,38,40 

w 



49,50,52,53,54 

DA-R 



55,56 

O 



57,58,59 

c 



60,63,64 

s 



66,73,75,76 

s 



67,77 

s 



68,69,70,71,72 

s 




Table 7 lists the final class number, computer symbol printout, and a brief 
description of the class or feature based upon the available ground truth. 

A user may now desire to interpret the results for his specific needs, which, for 
example, may be crop identification. Table 8 was prepared for this example and for 
examining the final results. Classes 1, 7, D, and / are not listed in the table because a 
specific crop name or feature could not be attached. Notice that classes 7, D, and / occur 
at the edges of the computer map. 

Figures 4 through 15 are the final classification results for flight line Cl, and 
Figures 4 through 9 correspond to the left side of the aerial photograph, while Figures 10 
through 15 correspond to the right side. 

In 'Figure 4, water is represented by the letter O and wheat by the number 0. The 
0’s have a slightly rectangular shape as compared to the letter O. Figures 7 through 9 
start to show several areas that were not classified. This is because the maximum of 43 
classes was obtained in the area of Figures 7 and 13, and the remaining unclassified 
resolution elements were significantly different from the 43 previously obtained features. 
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TABLE 7 FEATURE SYMBOL AND DESCRIPTION 


Computer 

Symbol 



Brief Description and Comments 


Unclassified boundary resolution element 
Unclassified nonboundary resolution element 
Unidentified - Classified as corn in Reference 6 
Com 

Mixture - 84% soy, 8% corn, and 8% unidentified 
Soy beans 
Soy beans 
Bare soil 

Mixture - 5 1 % soy and 49% corn 

Mixture - 73% corn and 27% soy 

Corn 

Wheat 

Corn 

Oats 

Com 

Undecided - Probably corn 
Oats 

Probably all soy - 89% soy and 1 1 % unidentified 

Diverted acres and rye 

Pasture and Oats 

Oats 

Corn 

Soy beans 
Soy beans 
Soy beans 
Soy beans 
Water 
Corn 

Diverted acres and red clover 
Soy beans 
Soy beans 

Diverted acres and red clover 
Diverted acres and red clover 
Pasture 

Diverted acres and red clover 
Corn 

Diverted acres and red clover 

Pasture 

Corn 

Com 

Com 

Oats 

Oats 

Oats 

Unidentified 
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TABLE 8. USER INTERPRETATION OF RESULTS 


Category 

Computer Map Symbol 

Corn 

2)8,9 ,A,C,J,P,X,=,$„ 

Soy 

3,4,5,F,K,L,M,N,R,S 

Bare soil 

6 

Wheat 

0 

Oats 

B,E,H,I,’,(,* 

Diverted acres - Rye 

G 

Diverted acres - Red Clover 

Q,T,U,W,Y 

Pasture 

v,z 

Water 

0 


According to Reference 6, the majority of misclassifi cation and non classification 
can be attributed to weed growth and low-lying areas within different fields. This 
probably accounts for the appearance of boundaries in the fields presented in these 
results, e.g., Figure 10 contains a horizontally presented wheatfield near the bottom with 
a misclassification of oats in the middle. This misclassification is caused by the presence 
of a low-lying area. The boundaries in the diverted acres-red clover field just above this 
wheatfield are caused by the presence of a small sand dune. 

It may be of interest to note that there are usually more features representing a 
row crop such as corn and soy and that the non-row crop fields such as wheat and oats 
are more homogeneously classified per field, This result is probably because of the sensor 
having to average not only over the canopy structure of a row crop but, in addition, 
averaging over a percentage of bare soil observable between the plants. The type of soil 
could also vary from location to location. 


Yellowstone Park 

Figure 16 contains a video reprint from channel 9 of the ground scene on the 
left, a boundary map in the center with the location of the initial clusters, and a first 
pass classification map with an additional selection of clusters. 
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Figure 4. Final Cl classification map — Section 1 . 
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Figure 5. Final Cl classification map — Section 2. 
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Figure 11. Final Cl classification map - Section 8. 
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Figure 15 Final Cl classification map - Section 12. 






The Yellowstone Park data were analyzed exactly as the Cl flight line data. The 
video reprint contains some ground truth and locations of training areas used by other 
investigators listed in References 11 and 12, 

Table 9 gives the merging procedure for both classification passes since no 
multiple merges were encountered No identification is given for the clusters because the 
cluster locations do not necessarily coincide with the training areas, and it is easier to 
identify the features after the final classification. The analysis of this data set indicates 
the type of problem involved when an inaccessible region is remotely sensed It is 
difficult to estimate what types of features can be extracted from the data, and it may 
be economically advantageous for the investigator to collect the ground-truth information 
based upon the feature extraction map, rather than trying to anticipate the needed 
detailed ground truth 

Table 10 was prepared for interpreting the final results based on information 
derived from the video reprint. Several meadow-like areas were discernible, but were only 
given meadow numbers corresponding to their class numbers because of lack of other 
information The geologic terms till, talus, kame, and vegetated rock rubble are described 
in References 11 and 12, but are repeated here for convenience, 

The class till consists of meadow areas underlain by glacial till They are grassland 
and sagebrush areas which were largely dormant at the time of data collection. Mineral 
soil is exposed in about one-fifth of the area and consists of silty to bouldery debris. 
Deer and elk manure are locally abundant in these areas. 

The class talus includes blockfields, talus, and talus flows of basalt lava flows, 
volcanic tuff, and gneiss, formed by frost-riving and solifluction from outcrops. They are 
blocky and well-drained deposits, and trees are widely spaced or absent. The blocks are 
covered with dark gray lichens and range from a few cm to about 1 m in diameter Most 
are larger than 10 cm. The slopes in these areas range widely from 35 to 45 deg at the 
head to 5 deg or less at the toe, 

The class kame is very similar to till except about one-fourth of the area is 
exposed mineral soil 

The class vegetated rock rubble consists of locally derived angular rubble, 
frost-riven from basalt lavas, volcanic tuff and breccia, and gneiss. Grass, lichens, 
evergreen seedlings, and moss cover more than three-fourths of the surface underlain by 
this debris. The rocks range in diameter from less than 1 cm to about 1 m and occur on 
slopes from 0 to about 25 deg. 

Three additionally known features in the ground scene did not appear in the 
classification map because they were not contained in a homogeneous area large enough 
to be selected by a cluster These features were water, bedrock, and bog. However, the 
areas where they appear on the classification map can be located with the aid of the 
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TABLE 10 INTERPRETATION OF YELLOWSTONE RESULTS 


Class No. 

Symbol 

Brief Description 

1 

1 

Meadow 1 

2 

2 

Meadow 2 

3 

3 

Meadow 3 

4 

4 

Dense forest and shadow 

5 

5 

Till 

6 

6 

Kame 

7 

7 

Meadow 7 

8 

8 

Meadow 8 

9 

9 

Meadow 9 

10 

0 

Meadow 10 

11 

A 

Trees 

12 

B 

Trees 

13 

C 

Talus 

14 

D 

Till 

15 

E 

Till 

16 

F 

Till 

17 

G 

Till 

18 

H 

Till 

19 

I 

Vegetated rock rubble 

20 

J 

Meadow 20 


video reprint, and they appear as unclassified areas. These classes could now be identified 
elsewhere in the data by using a supervised technique and selecting these unclassified 
resolution elements as training areas. 

Figures 17 through 28 contain the final classification results for the Yellowstone 
Park test site. Figures 17 through 22 correspond to the left side of the video reprint, and 
Figures 23 through 28 correspond to the right side of the video reprint. 
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Figure 17. Final Yellowstone Park classification map - Section 1. 
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Figure 18. Final Yellowstone Park classification map - Section 2. 
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Figure 19. Final Yellowstone Park classification map - Section 3 
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Figure 21 Final Yellowstone Park classification map - Section 5 
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Figure 22. Final Yellowstone Park classification map - Section 6. 
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Figure 23. Final Yellowstone Park classification map - Section 7 
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Figure 24. Final Yellowstone Park classification map - Section 8. 
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Figure 25 Final Yellowstone Park classification map - Section 9 
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Figure 26. Final Yellowstone Park classification map - Section 10. 
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Figure 27 Final Yellowstone Park classification map - Section 1 1 
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Figure 28. Final Yellowstone Park classification map - Section 12. 



V. CONCLUSIONS 


The feature extraction can be interpreted as being an extension to photography. 
Photography is capable of presenting patterns in colors or various shadings for 
interpretatio'n by an observer, but, often, some features appear which are similar, but 
they are not similar, and vice versa. Computer feature extraction adds the capability to 
distinguish subtle differences in multiple digital data images and to make a decision 
concerning the similarity of those features in question. This extension can be exploited to 
its fullest limits, in some cases, when a data bank can be used for actual identification of 
the features. 

The purpose of this work was to demonstrate the capability of extracting features 
from digital data images without involving an observer in the data-processing loop and to 
compress unmanageable data into manageable and useful information, An observer would 
still be needed to interpret the results, 

The success bf the data compression is significant because of the 12 channels of 
data that were compressed into 1 and because of 200 022 resolution elements that were 
reduced to a maximum of 45 distinct categories. 

The computer programs presented in this work were purposely developed to be as 
general as possible, and the ultimate success of these computer programs for information 
extraction can only be determined when the programs are tailored and applied to solving 
a specific user or user agency’s needs. The present utilization of these programs for 
feature surveying, such as the vegetation and geological categorizations presented in the 
text, can be interpreted as being successful since the comparison of results with those 
previously published in the references appears very favorable 

The two keys to success in using the unsupervised feature extraction program are 
the production of a boundary map which cleanly separates homogeneous areas belonging 
to different features and the choice of the decision rule for spectrally merging similar 
features, 

The over abundance of boundaries in the Yellowstone data indicates the need for 
improvement in the mathematical definition of a boundary or at least a means for 
improving the threshold using the present boundary definition, If repetitive coverage were 
available for a particular test site, the optimum threshold for a boundary decision could 
probably be determined. Otherwise, the experience gained from working with other types 
of data sets or from perusing the literature would have to provide the impetus for 
determining a better mathematical definition of a boundary The two most important 
properties that need to be considered for this definition are most probably the resolution 
and some measure of roughness in the pictorial scene of the digital data image. 
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The use of a p x p array for obtaining initial cluster statistics in a homogeneous 
area may have an advantage over the manual selection of training areas since the cluster 
areas can have a fairly arbitrary shape rather than being rectangular and the data selected 
from these areas do not contain unusual data points which could bias the statistical 
calculations. The use of the pxp array does have the definite advantage because it is 
faster than manually selecting the coordinates of the training area and entering these 
coordinates into the computer 

The use of an elliptical decision rule for merging and classification appears to be 
very acceptable as evidenced by the fact that practically all of the resolution elements 
used for calculating the statistics of a feature are classified as belonging to that feature. 
This fact is even evident when several initial clusters have been merged to form a final 
class. Additional supporting evidence to this conclusion is that the initial clusters selected 
on the second pass through the data did not merge with any of the final classes obtained 
from the first pass through the data, and an almost unnoticeable amount of resolution 
elements had their original classification changed. Most of the error in classification 
occurs near boundaries and near the edges of the scan lines The first type of error is 
probably caused by the data changing from one feature to another in the vicinity of a 
boundary and in the process passing through a decision region of a third different 
feature. The misclassifieation near the edge of the data is because of an optical effect 
called scan angle error The angle at which the ground scene is viewed at the edge of a 
scan is usually 30 to 40 deg off from the local vertical and, as a consequence, the signal 
that is recorded by the sensor reflects an angular dependence It is reasonable to assume, 
however, that the use of an elliptical decision rule could consider the angular dependence 
The angular effect should be approximately linear dependent for the different channels of 
data and this would amount to a length stretching of the principal axes of the 
classification ellipses. It is apparent that the classification maps probably contain more 
detailed information than is actually desired. The detailed information present can be 
further compressed by visual interpretation and manually merging the desired features by 
changing the symbol output on the classification map. 

When the features are manually merged, caution must be exercised in interpreting 
a given feature extracted from the data, e.g., there are several features which represent 
soy, corn, and a mixture of com and soy Although there is a temptation to attach a 
simple description that is commonly used, the description may be incomplete with 
respect to the information presented by the data, and a logical manual merging may not 
be possible. Detailed ground truth would be needed to provide the qualifications and 
adjectives for a complete description For this reason, it may be important to perform 
the data analysis before prejudging the information content of the data rather than using 
the ground truth to assist in the analysis of the data. 

Finally, it must be emphasized that the development of the unsupervised feature 
extraction computer programs was directed toward obtaining a computer logic that could 
extract information from remotely sensed data with a moderate degree of success, which 
meant that computer running time necessarily took a “back seat.” Since the development 
work has been completed, efforts can now be directed toward optimizing the computer 
time and efficiency of the programs. 
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APPENDIX A. PROGRAM DESCRIPTION 


The computer programs which utilize the equations in the text are written in the 
form of subroutines and have been included as an integral part of a much larger 
computer program called an Earth Resources Processor. This processor contains several 
preprocessing and data display routines in addition to the classification programs and is 
documented and flow charted in detail in the final report of contract NAS8-26797 by 
IIT Research Institute, Chicago, Illinois [25] . Hence, only a brief description of these 
subroutines will be given. A small executive program can be written for controlling the 
sequence of calling these subroutines by following the logic flow previously described in 
Figure 2 and discussed in Section III. 


Subroutine BWNDR3 

Subroutine BWNDR3 is the first stage of processing or the boundary program and 
reads the raw data tape as input. This subroutine contains equations (8) through ( 1 6) and 
the logic for using these equations. The boundary program also contains two subroutines 
which will be discussed. Subroutine GET6 is used for getting the data off digital tape and 
putting it into the computer The content of this subroutine has to be specialized to the 
type of digital tape format desired. Subroutine JNTPB is a joint probability distribution 
program. 


Subroutine JNTPB 

Subroutine JNTPB is used for calculating the joint probability distribution 
of s x versus s y in equation (8) and the decision rule as x 2 + bSy 2 + cs x s y < 1 in 

equation (16). This program has a fixed storage allocation, but the shape of the joint 
distribution can be completely arbitrary because a search mode of operation is used 
rather than a table look-up procedure. This program will accept data of any dynamic 
range and output a scatter diagram bounded by the minimum and maximum values of 
the data, This program contains three subroutines (1) DJCOBI is an IBM library system 
subroutine for calculating the eigenvalues and eigenvectors used in the ellipse equation, 
(2) LABEL6 is used for labeling the axes of the scatter diagram, while subrouting 
PLTBF6 is specialized for use with the Stromberg-Carlson 4020 peripheral equipment to 
obtain microfilm copies of the boundary map, and (3) JNTPB also outputs the boundary 
tape for use in the second stage of processing. 


Subroutine CLASFY 

Subroutine CLASFY contains the second, third, and fourth stages of processing 
via subroutines TRUCK, SEQMRG, and CLASS, respectively This' subroutine also 
controls the number of classification passes desired and the size of the pxp cluster 
selection array via subroutine TRUCK. 
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Subroutine TRUCK 


The second stage of processing is subroutine TRUCK, which reads the boundary 
map tape as input data and locates the homogeneous areas that are large enough to 
contain a pxp array. This subroutine also performs the spatial merging when two 
different clusters collide. The output of this subroutine is a boundary map tape 
containing the location of the initial clusters, which is input to the third stage of 
processing. Subroutines LABELA and PLTBFA are used for labeling scan and column 
numbers and obtaining microfilm, respectively, for the visual display of the map obtained 
from subroutine TRUCK. 


Subroutine SEQMRG 

The third stage of processing is subroutine SEQMRG and uses the boundary and 
cluster location tape and the raw data tape as input. The boundary and cluster tape is 
used to locate and retrieve raw data belonging to each cluster Statistics utilizing 
equations (17) through (30) are then calculated for each cluster and used for deciding 
whether to spectrally merge clusters. The clusters are read in sequentially, and each new 
cluster has the opportunity of merging with any or several of the previous clusters. 
Subroutine SKRBIN is an IBM system subroutine and allows for skipping binary records 
on the magnetic tape. This routine is primarily used with the cluster selection since the 
clusters may be located anywhere on the tape. Subroutine FETCOR is used to retrieve 
and calculate the centroid of each cluster and to calculate the cross-channel correlations 
for each cluster. The means are subtracted from the correlations to form a covariance 
matrix in subroutine AMTRX, which becomes an input to the eigenvalue and eigenvector 
subroutine DJCOBI. Subroutine ROTA is a rotation matrix calculation used to rotate a 
feature vector into the coordinate system of a cluster Subroutine KCHECK is used to 
check whether the centroids of two clusters fall within each other’s ellipses or merge. If 
two clusters will merge, the statistics of both clusters are combined and updated. The 
output of SEQMRG is a tape containing the statistics for the final set of classes. 


Subroutine FETCOR 

In addition to retrieving and correlating the raw data for each cluster, subroutine 
FETCOR keeps track of the starting and stopping scan lines and columns of the clusters 
that are in the computer This information is used with the subroutine BSRECD, which is 
an IBM system library subroutine for backspacing records. The use of this subroutine 
allows for backspacing the tape to locate the data for the next cluster, rather than 
rewinding the entire tape and searching for the next cluster’s data. 


Subroutine CLASS 

The fourth stage of processing is subroutine CLASS and uses the tape containing 
the statistics for each class, the boundary tape, and the raw data tape as input. This 
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subroutine mainly contains equation (31) and the provisions for outputting the 
classification map on standard computer printout or microfilm. The classification map is 
also output on tape for use in subroutine CLASFY to obtain additional clusters and 
classification passes. 
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APPENDIX B. COMPUTER PROGRAM LISTINGS 


SUBROUTINE pwNdRb 

DIMENSION X ( 12 » 256 ) * V ( 1 2 * 2 56 ) . MCM AN ( 12 ) » NN ( 2 56 ) * KSYM { 49 ) » .JSVM ( 

I 256 ) 

DIMENSION NWHIOHI 1 ? ) 

NAMELIST/ lNPUT6/NSrANS»NSTART»NsPS»NCH»NVAR»NSYM,ISUM,NBTLG» 
lMODE»ITYPE».MSFC* .N5KIP,NBLK,INCX* INCY.NSTXtNSTYfNCRf 
NaMEL I ST /NCHUSF/NWH i oh 
FQUIVALFNOF (NSOAN,NSoaNS) 
fQUIVaLfNof (NSTRT,NSTaRT) 

FQUIVALFNOF ( NrOL »NSPS ) 

FQUIVALFNOF ( NOHAN » NCH ) 

I CARDs 5 
I PR I NT = 6 

I NT APE = 1 0 
I OT APE= 1 1 

RfaDI IOARd, INPUT 6 ) 

WRtTf ( I PRT NT » INPUTS ) 

RFADt I CARD * NOHUSF ) 

WR T TF ( IPR I .NT *NohUSF ) 

1 FORMAT ( 1 X * 7 1 4 ) 

3 FORMAT ( 1 X » 12 ID 

RFADt I 0 ARD » 5 ) ( KSYM ( I ) , 1 = 1 ,NSYM) 

6 FORMAT ( ) X * 60 A 1 ) 

NFLAGsO 

AVF=ISUM 

APOPrO.O 

DXAVF=0,0 

DYAVFsO.O 

D7AVF=0.n 

NSAVsNSOAN 

IF ( NS.K I P .FQ. 0) GO TO 98 

DO 9 7 I=1»NSKIP 

OALL SKRbIN(TNTAPF,1 ♦■NOP) 

07 CONTINUE 
OB CONTINUE 

2 FORMAT ( 1 HI ) 

4 FORMAT (5X. 111. ID 

II = 1 

KK=NSTRT- 1. 

160 IF{ 1 1 .FQ.NSOAN) 0-0 TO 510 
11=11+1 
NFLAG?=1 
KK=KK+1 

IFIII.NF.2) GO TO 290 
DO 170 JJ=i,NCOL 

CALL GET 6 ( X ( 1 » J J ) ♦ NoOL , 0 ♦ NOHAN »NSCANO » I NjaPF > I ERR >NFLAG2 »NstRT ♦ 

1 N R T L G » M 0 D F » N C Rf , I T Y P F , M S F 0 ) 

170 OONTINUF 
290 OONTINUF 
NFLAG2=1 

DO 300 JJ=1,NC0L 

call GET 6 ( Y ( 1 » JJ) »NoOL,0 *NohaN»NSOANO, iNtaPFj I eRR»NfLaG2 »nstRt» 

1 NDTLG ♦ MOdF »NoRF » I TYPE *MSFC ) 

300 CONTINUE 

DO 380 J J=2 »NCOL 
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IJ=JJ~1 

x5um=o.o 

YRIjMsO.O 

ZSUM=0.0 

no 360 I CHAN= 1 » I SUM 
rrCHN=NWHICH< irHAN) 
xn I FF=V ( I I CHN , J J ) -Y { T I CHN * I J ) 

YOT FF=Y ( 1 1 CHN » JJ ) ~X ( I ICHN.JJ ) 

XSUM=XSUM+XO I FP#XOT FF 
YSUM=YSUM + Yni FF*YDT FF 
Z SUM=Z SUM+Xn i F.F*YD T FF 
360 CONTINUF 

XSUM=XSUM/AVE 
Y5UM=YSUM/AVF 
Z SUM s Z SUM / A V F 
APOP=APOP+T .0 
A A- 1 . 0 /A POP 
ao=1 ,0-AA 

OXAVP=RR*nX4V r +A A^YSUM 
OY AVF=RR*nY AVF+A A*X$UM 
OZ AVF=RB*nZ AVF+A A*ZSUM 
XSUM=SQRT ( XSUM ) 

IF (XSUM .LT. 33.4) GO TO 366 
364 XSUM=53.0 

Y S UM = 33,0 
60 TO 366 
’63 roNTINUF 

Y SUM a SORT (YSUM) 

IF (YSUM .LT. 33.4) GO TO 366 
GO TO 364 
363 CONTINUF 

CALL JNtPR(YSUM,XSUM,NfLAG»0»'T,KSYM(3 ) ,NPOP, 
?nxAVF,nYAVF » 
anZAVP* 

1 JJ »NScaN , I I ,NrOL » JSYM,X ,NSTRT »NN, iNrXl 
NSCANsNS AV 
3 R 0 CONTINUF 

DO 500 J J = 1 »NC0L 
DO 450 ICHAN=1 , 1 SUM 
I TrHN=NWHICH( I ''HAN) 

X ( I ICHN, JJ ) =Y ( j irwN » J J ) 

4°0 CONTINUF 
300 rONTTNUF 
GO TO 160 
51 0 CONTINUF 

nflag=i 

call JNTP B ( YSUM * XSUM , NFL AG » 0 » o , KS YM ( 3 ) , NPOP , 
?DXAVF*DYAVF* 

3n?AVF, 

1 JJ > nscan » 1 1 »NrOL » JSYM » X » NSTRT » NN » I NcX ) 

RFWIND IOTAPF 
RFWIND I NT APB 
C CALL CLFA.N 

rfturn 

FNO 
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SUPROUTINF LABFLAINSTART, NSTOP * INCRf 
ntMFNSTON i OUT ( 1 ?0 ) 
MniF=(NSTOP-NSTART+l ) /INcRf 
1 1=0 

no 1 I =NST ART * NSTOP » I NrRF 

it=ii+t 

iouTt i n = 1 / 1.000 

1 CONTI NUF 

WRtTf ( 6 » 1 0 ) ( T 0 U T ( T ) » I = 1 » Nn T F > 

11=0 

no 2 I=NSTaRT»NSTOP,INCRF 
IT = T I + 1 

I0()T( T r ) = T /I 00-1/1^00*10 

2 CONTI NUF 

WR I T F (6*10) ( I OUT ( J ) * I = 1 , NO T F ) 

11=0 

no A I=NSTART , NSTOP , INCRF 
I T = M + 1. 

TOUT ( T I ) = 1 /I0 -T/ 1 nn*1 0 
COMTINUF 

WRTT^ (6*10) ( I0UT( T ) *1=1 >NniF) 

11=0 

00 4 T=NSTaRT .NSTOP, INCRF 

1 I = I T +1 

I OUT ( 1 1 )=I~ 1 / 10*10 

IF ( I OUT (IT) .LF. 0 1 T 0 U T ( T T ) - 0 

4 FONT I NUF 

WRTTr (6*10) ( T OUT ( I ) * I = 1 ♦ NO I F ) 

1.0 FORMAT ( 1 1 X » 1 20 1 1 ) 

RFTURN 

FNn 
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SURROUT INF JNTPp (haTAH , FIAT AV » NfL ag »MT X » M I V » ALPNi iM , NPOP , 
?OXAVf,OYAVf» 

30? AVF * 

1 J J * NSC AN * T Sr aN ,NC0L , I SYM ,N X , NSTRT , N.N » I NCXY ) 

OT MFNSION iNrXYM) 
n j m f M 5 1 0 N NP ( FA » F4 ) 
nf mfNFJ ON fiaTa (1 7 ) 
n T w F n S I ON T R I M ( ? f f ) 
r»I MFMS1 ON I SY V ( 1 ) , TOUlT.d 00) 
otv.fnSION NX(1 ) »NN( i ) 

n T MFNSI ON ALPNUM ( 1 ) , ALPHA < \ ?0 ) >cORnX ( 3 ) 

HOURLF P R f c 1 3 1 0 N A < ?»?)» F I f,fN( ? »2 ) 

I NTFGFR ALPNUM , ALPHA fRLA.NK 
n A t A ASTR IK/1 H#/ 

DATA XMARK / i HX / 

OAT A PLANK /6H / 

OATA NFLaf,4/0/ 

1060 FORMAT (48X »?5HRATA SWITCH HAS OCCURRFO 1 

1061 FORMAT ( 49X * 30HJO t NT PROBABILITY OISTRtrUTION ) 

106? FORMAT (i Hi) 

1063 FORMAT (44X»11h X-AXtS IS »T6»6X,llH Y-aXiS jS ,?T6) 

1066 FORMAT ( 30X ,6HOX AV<f= ,f!5 , 7 ,6moYAVR= »F15 , 7 ,6HRZ A\/R= > R 15 ,7 ) 

1040 FORMAT ( lH *67 h MaXtMUM PROBABILITY Of i iNfCMMON A t T TV f x ff rnpn- fONt 
1TNIJF FXfI TT ON , 2 T 6 ) 

1064 FORMAT ( IX »26hSYMP0l N/SYMrOL ) 

1065 f0RMaT(11X*12K1h*) , / » 1 IX ■, 1h*»5 5X * 5hP aRT .U,4h Of , t 1 ,5 3X» 1h* 

1 >1 1 X * 1 ?i ( 1 H* ) ) 

IF (NFLA04 .OT, 0) r,0 TO BO 

NF(.G-3 = 0 

1 OOP FORMAT ( l X ,4 7 Ai ) 

Nf|.G = 0 
N T = ? 

I RW= i 
NFLGFN=0 
I R T = l 1 

00 1 1=1 ,*4 

00 1 J = i , =4 
NP ( T , J ) =0 

l FONT T N'JF 

R F W I M O TRT 
RfWINO IRW 
NFL AG4= 1 
80 CONTINUF 

IF (NFLAF-.GT. 0) GO TO 13 

NF = DA T AV+1 * 5 

NR=0ATAH+1 .F 

if (Nr , L t , i ) nc=i 

IF (NR ,LT. i) NR=l 

NP(NR»NC)=NP(NR*NC)+l 

1 a F4* ( NC-1 ) +NR 
I RT N ( J J ) = I 

IF (JJ ,LT. NCOL ) ro TO 15 

I P T N ( 1 ) = T R I N { ? ) 

WR T T F ( IRW) ( IR?N( T T i ,TI=1 » NrOL ) 

15 FONT I NUF 
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rfturn 

16 CONTINUF 

17 CONTINUF 
RFWIND IRW 
I OP T = 1 

I N = ? 

IM=? 

RhO= 1,0/ ( 1 0,0** 6 ) 

411)1) =nXAVF 
A ( 7 * 7 ) = DY AVF 
AM »7)=0ZAVf 

A ( 7 * 1 ) =OZ AVF 

CALL DJcORI ( a » I M , I N , I OPT , RhO » FRR » f I CFN ) 
r WR T Tf ( 6 > 1 067 ) A ( 1 » 1 ) * A ( 1 *2 ) ,fi.gfN ( 1,1) ,FIGFN (1,2) 

C WR T Tf ( 6, 1 067 ) A ( 2 * 1 ) , A ( 2 ,2 ) , F T GfN ( 2,1) » F I G F N ( 2 » 2 1 

1067 FORM aT ( IX , 2Fl 5 . 7 , 10X » 2Fl 5 ,7 ) 

DX A VF= 1 , 0/ A ( 1 ,1 ) 
ny a Vf=1 , O/A (2*2) 

A ( 1 » 1 )=FTGFN( 1 *1 )*nXAVF 
A ( 1 *2)-FTGFN(2»l)*nXAVF 
A (7,1 ) = F I G c N ( 1 ♦ 2 ) *0 Y AVf 

A ( 7 , 7 ) =FTC)FN ( 2 » 2 ) #p i'YA VF 

nXAVF=FTr-FN ( 1 , 1 ) *a (i , l ) +figfN( 1,2 )* A ( 2, 1 ) 

OYAVF-FIGFN ( 7 » 1 ) * A ( 1 » 7 ) + F I G F |\| ( 2 * 2 ) * A ( 2 * 2 ) 

DZ AVF=FTGFN (1,1 ) * A ( 1 » 2 1 + f I c f N ( 1 » 2 ) * A ( 2 » 2 1 +F I G- N ( 2 * 1 ) * A ( 1 » 1 ) 
1+ftgfN ( 7,2 ) *A ( 7 * 1 ) 

WR T T F (6*1 066 ) nXAVF ,DY aVF , nZAVF 
I=g 

DO 160 NC- 1 * 64 
DO 160 NR=1»64 
I = T + 1 

TF ( NP ( NR *NC ) .Fa, 0) GO TO 1 GO 
XXX=NR*NR 

yyy=nc*nc 

ZZZ =NR*NC 

sum=dxavf*xxx+dyavf*yyy 
1 +D7AVF*Z zz 

IF(SUM.GF.1.0)GO to 116 

NX ( 1 1=0 
GO TO 1 GO 
1 1 g NX ( I ) =- 1 
1G0 CONTINUF 

WR I Tf (6*1.067) 

WR t TF ( 6 * 1 0&1 ) 

WR I TF ( 6 , 1 066 ) DX AVF , DY A VF , DZ AVF 
WRITF(6M 064) 

C CALCIILATF TABLF 

MAXKNT=0 
DO 2 7 NC = 1 * 6G 
DO 2? NR=1 , 6G 

IF ( NP ( NR » NC ) .GT. MAXKNT) MAXKNT = NP ( NR * NC ) 

22 CONTINUF 

IF (MAXKNT ,LT, 46 ) MAXKNT=46 

NFACT=MAXKNT/46 

XXX=FLOaT (MAXKNT ) 746.0 
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NFAC=0 

IF ( N F A r T .LT. 1) N F A C T = .1 
WR T TF ( 6 » 1 05 0 ) R L A N K , N F AC 
NFAC=NFAC +NFAFT 

00 1 = 1 ,46 

WR T Tp (6 » 1 060 ) aLPMUM( T) »NFAC 
1060 FORMAT ( 6 X » A 6 > 6 X * 1 6 ) 

NFA<" = NFAC +NFATT 
98 CONTINUE 

WRtTf(6,i Of? ) 

C PRINT DISTRIBUTION ON PAGE 

CORDX ( 1 ) =0.0 
C0ROX ( 2 ) =C0R0X ( 1 } +60 . 0 
rOROX ( 0 ) =cORoX < 11 + i i 0.0 
1 0?1 FORMAT (1 HI ) 

no 66 IF NO = 1 , 6 4 

Nr=66-IFNO 
DO 66 I=i,54 
ALPHA ( I ) = BLANK. 

66 CONTINUE 

IF ( NFLG .FQ. 1) GO TO 60 
DO 68 I=i,54 
IrtN{ I )= o 

66 CONTINUE 
60 CONTINUF 

no 64 NR=1. ,64 
XXsFLOAT ( N P ( N R » N C ) ) 

IF (NFLG .FO. 1) GO TO 91 

1 PIN (NR )=NP(NR,NC) 

91 CONTINUF 

irAR=XX+(l. 001-1 .O/XXX) 

IF (ICAR . GT , 46 1 T r aR= 46 
ALPHA ( NR 1= ALPNU M ( IfAR) 

64 CONTINUF 

CORDV=FLOAT (NC) 
ymarg=xmark 

IF (NC .NF. 54-IFNo/I 0*10 ) YMARG=ASTRlK 
WR I Tf ( 6 * 1 008 ) rORDY , YM aRG , ( ALPHA ( 11,1=1,54) 
1008 FORMAT) 1 X , F 8 , 1 , 2 X , A 1 , 1 ?0 A 1 1 
66 CONTINUF 

WR I T F ( 6 * 1 0 1. 0 ) 

WR r Tf ( 6 , 10 1 1 ) rORDX 1 1 ) ,rORnX ( 2 ) >c0RnX (3 ) 
1011 FORMAT (6X,F10.4,50X,F10.4,40X,F10.4) 

NFLG=1 

WRlTF ( 6 , 1 06? 1 
N6HB=1 
LWFR=i 
LOW=NSTRT 
706 CONTINUF 

NHI=L0W+1 20-1 

NUPPFR=LWER+120-1 

IF (NUPPFR .GT. NCOL ) NUPPER=NCOL 

WRITE (6, 1062) 

CALL L APFL6 ( LOW , NH i , 1 ) 

DO 19) 1 1 =N I , NSCAN 
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RFAn(tRW) ( TPTM(JJ) * J J = 1 » NFOI ) 

00 1*5 JJ = 1 » NFOL 
I FMFFK= I R I N ( J J ) 

JFHFFK=NX ( ITHFCK ) 

IF (JCHFfK ,N c . 0) r-0 TO 117 
ISYM ( JJ ) = ALPNUM ( iMS'JP-1 ) 

NN ( JJ.) =0 
GO TO IGF 

117 1 GYM t JJ ) =ALPNUM ( NSlJP ) 

NN ( J J ) =-i 

IGF rONTlNUF 

IF (NFLGFN .NF, 0) GO TO 1G6 
WR I TF ( IRT ) (NN ( JJ) ,JJ = i ,NCOL ) 

GALL PLTPF6 ( I SYM »NrOL »Nnt:< » iNcXY ( 1 ) , iNrXY( ? ) > I NrXY ( G ) > 
1 TNrXY ( A) * NcRf ) 

IGA FONT I NUF 

WR I Tf ( 6 » 1 0G6 ) I I » ( T GYM ( J J ) , JJ = LWfR ,NUPPFR ) 

1036 FORM AT ( 5X . 1 6 , 1 20 a 1 ) 

] OGF FORMAT (IX ,16 ) 

1 31 FONT I NUF 
NF|_GFN=1 

rfwind IRW 

L WFR = NUPP FR + 1 
L0W=NHI+1 

IF (NUPPFR ,LT. N F OL ) GO TO 70F 
9QF FONT I NUF 
I Ol 0 FORMAT 

NFLGFN=0 

RFTtjRN 

FMO 


68 



.SUBROUTINE CLASSY 

rOMM.ON /LARi/XPAR( 43 »l?) , S TC-Ma ( 43 , 1 2 ) ,ROT ( 43, 12 ,12 ) 
rOMMON /LA 4.2 /X ( 12) , aLPha ( 49) , NSPS ,N 5 r aNS ♦ N.r H AN ,LT9 ,Lt 1 p ,LT1 1 ,LT ] 2 * 
1 L T 1 3 ., LT 1 , T X XX » I Y Y Y , 

1NSTART,NST0P, 

1 NpTLG»MOpP, T TYPF,,Vcrr , f 4 , NcR E * 

1 N.SK IP, INCX * INCY.NSTX »NSTY 
N . 4 M ELI S T / P A SS E S / N P A SS ,NCL U ST 

NaMfLtST/ INPUT a /NSPS ,NSr aNS, VhaN,LT'1 »LT9 , LTIO ,LT1 1 ,LT12 , : LT13, 

1 N S T A R T » N S T OP , N p T L r » MO n r » r T Y P p » Ms F r , t 4 ,NcRr v Y$KrP , 1 Nr * , 7 Nf Y ,NSTX » 
?NSTY > t XXX , I YYY 

rf A n ( 5 » Passes > 

RPAP( 5 > INPUT A ) 

WRITE ( 6 * T-NPUTA ) 

Rfao ( 5 » 1006 ) ( ALPHA ( I ) * T =1 *48 ) 

1 0<AA FORMAT ( 1 X ,fiPA1 ) 

KOl'NTaNCLUST 

NSFANS=N SCANS” 1 

INI TrL-NcLUST + 1 

00 1. I = 1 »NPASS 

CALL TRUfK ( N rLUST »NPA SS ) 

CALL SFQMRG { NrL'JST > KOUNT * I NI TCL ) 

CALL CLASS (COUNT , I, NPASS ) 
nfujst = count 

T NT TCL-KOUNT+1 
1 CONTTN'JF 
RfTURN 
FNO 
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SURROUT I NF TRUCK (NrrNT.NPASS ) 

DIMENSION NNaCC< 12,256) *MTaR ( 1 1 ) » IPRT (256 ) »lPLOT(256) 

DIMENSION NTBL(400) 

COMMON /LAR1/XraR(43»12) ,S I gMa < 43 » 1 2 ) »ROT ( 43 » 1 2 * 12 ) 
rOMMON /LAP2/X ( 12) ,NSYM(49) , NSPS »NSCAN$ » NchaN , LT9 » LT 10 ,LT 1 1 , i-Tl 2 » 
1LT13»LT1»IXXX»TYYY, 

1 nstart.nstop, 

1NpTLG*M0DF» ITYPF*MSPC*I4,NCRE* 

INSKIP. INcX»INcY.NSTX»NSTY 
NFLGXX=p 
NFLAGX=0 
RFWIND LftL 1 
RFWIND LT 1 
NFLAGl =0 
MFINsO 

iXTY=lXXX*!YYY 
DO 10 1=1 » I YYY 
M T A R ( I ) = I 
]0 CONTINUE 

DO 50 1=1,400 
NTRL(I)=I 
50 CONTI NUF 

DO 11 1=1,1 YYY 

RFADILTII ) (NNACC< T ,JJ) , JJ = 1 »NSPS) 

MFIN=MFIN+1 
11 CONTINUE 

NuP=NSPS-I XXX+1 
ncnt=nccnt+i 

NFL AG=0 
?00 CONT I NUF 

I T t=MTAR (1 ) 

DO no JJ = 1,NSPS 

IF (JJ ,GT. NUP) GO TO 102 

I J = JJ 

jr=jj+ixxx-i 

N7FR0=0 
IKNT=0 
I SlJM = 0 
Jl J=MTAR(1 ) 

NTFMP= NCCNT+1 
DO 101 I = I J » J I 
DO 100 Jl J=1 , I YYY 
II J=MTAB(JI J) 

IF ( NNACC ( I I J , I ) .LF. NCCNT .AND. NNACC<IIJ»I> .NE. 0) GO TO 102 
IF (N.NA-rr ( 1 1 J, I ) ) 102,107,106 

106 IF ( NNACC t I I J » I ) .GT. NTEMP ) NTFMP=NN4CC ( 1 1 J , I ) 

GO TO 100 

107 NZFR0=NZER0+1 

100 CONT I NUF 

101 continue 

IF (NZERO .NE. IXIY) GO TO 105 
DO 103 I = I J , J I 
DO 104 J I J= 1 , I YYY 
NNACC ( J I J » 1 1 =NcNT 
104 CONTINUE 
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108 CONTI NUF 

NCNT=NCNT+1 

IF (NCNT .GT. 400) GO TO 999 
GO TO IT n 
1 Oc CONTINUE 

DO 108 1=1 J»JI 
DO 108 J I J= 1 > I YYY 

IF (N.NACCC JIJ*I) .FQ, 0) NNACCt JIJ.I )=NTFMP 
108 CONTI NUF 
GO TO MO 
10? CONTINUE 

110 CONTINUE 

DO 111 JJ-l.NSPS 

?F (JJ .FQ. NSPS ) GO TO Ml 

IF (NNACCt III .JJ) .IF. NCCNT) GO TO 111 

IF (NNACC ( I II .JJ + 1 ) .LF. NCCNT) GO TO 111 

IF (NNACC < I IT » JJ) .LF. 0) GO TO Ml 

IF (NNACCt I I I > JJ + 1 ) .LF. 0 ) GO TO Ml 

IF (NNACC ( I 1 1 * JJ) .FQ. NNACC (III > JJ+1 ) ) GO TO 111 

IJ = NNACC(MI.JJ) 

Jl = NNACC ( I I I * JJ+1 ) 

IF (JI ,GT . 400 .OR, U .GT. 400) GO TO 111 
IF (NTBL(JI) .GT. NTRL(TJ)) GO TO 125 
NTrL( I J ) = N TrL ( J I ) 

GO TO 111 
125 CONT I NUF 

NTRL( JI )=NTRL( I J) 

111 CONTINUE 

10 07 FORMAT MX. 16) 

WRlTF (LTi ) (NNarr ( M I. JJ ) » JJ=1 .NSPS ) 

IF (VEIN .GF, MSCANS) GO TO R99 
I Y T =MT A8 ( 1 ) 

RFADtLTll ) (NNACCt TYI , JJ) » JJ = 1 .NSPS) 

MFIN=MFIN+1 

ntfmp=mtab ( 1 ) 

IYY= IYYY-1 
DO 1?1 1=1 » IYY 
MTAR ( I ) =MTAR( 1+1 ) 

121 CONT I NUF 

MTAR ( I YYY ) =NTRMP 
GO TO ?00 
99° CONT I NUF 

DO 122 I =2 » I YYY 

MI =mtab ( I ) 

DO M? JJ = 1 .NSPS 

IF (JJ , FQ . NSPS ) GO TO 117 

IF < NNACCt It I .JJ) .LF. NCCNT) GO TO 112 

IF (NNACC ( II I » JJ + M .LF. NCCNT) GO TO 1.12 

IF (NNACC ( I I I . JJ) .IE. 0) GO TO 112 

IF (NNACC (III .JJ+1 ) .LF. 0) GO TO 112 

IF (NNACCt I II »JJ) .EQ, NNACC < II I » JJ+1 ) ) GO TO 112 

I Jr NNACC (III . JJ) 

J I r NNACC ( I I I .JJ + 1 ) 

IE (JI .GT. 400 .OR. TJ .GT. 400) GO TO 112 
IF (NTBL(JI ) .GT. NTRL ( I J ) ) GO TO 126 
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NT f>L ( I J ) = NTrL ( J T ) 

GO TO 112 
126 CONTINUF 

NTBLCJI )=NT8L( IJ) 

11? CONTINUF 

WR I TF ( LT 1 ) CNN ACC ( 1 1 1 » JJ ) » JJ = 1 »NSPS ) 
12? CONTINUF 

FNP FILF LTl 
R-FWIND LTl 
RFWIND LT 1 1 
RFWIND LTl? 

WRlTF(6.l007) (NTPL ( I ) » 1 = 1 » 400 ) 

00 ITS 1=1,400 

TP (NTRL (I) ,FQ. 1) GO TO m 
J T = 1+1 

IP < NTPL ( J I ) .NF, T) GO TO H4 
NTPL ( J I )=NTPL ( I ) 

114 CONTINUF 
11 2 CONTINUF 
II = 1 

NTFMPsI'I 
00 H6 1=7,400 

IP (NTPL ( T )-NTPL ( T-l n 117,118,119 
119 IP (NTRL(T) .NF. I) GO TO 117 
N T R L ( 1-1 )=NTFMP 
IT=I 1+1 
NtfMP= I I 
GO TO 116 

118 NTRL ( 1-1 ) =NTF'MP 
GO TO 116 
1 1 1 N = N T P L ( T ) 

NTPL { 1-1 ) =NTFMP 
NTFMP = NTRL < N ) 

116 CONTINUF 

NTRL ( 400 )=NTFMP 

WR I T F ( 6 » 1 00 ? ) ( NTp,L ( I ) ,1 = 1 ,400) 
LWFRrl 
LOV=N START 
706 rONTTNUF 

N U PP F R = L W FR +12 0-1 
NHT=LOW+1 ? 0-1 

IF (NUPPFR .gt. NSPS ) NUPPfR=NSPS 
mi f=nuppfr-lwfr+i 

WR I TF ( 6 ,1005 ) 

1008 FORMAT (1 HI) 

call LARFLA (LCW»NhI ,1 ) 

DO 710 IT =1 »NSCANS 

Rpad( LT 1 i (NNAcr (1, JJ) ,JJ=1 ,NSPS) 

DO 115 JJ=1»NSPS 
Ir=NNaCC(1» JJ) 

IF (IP .LF. 0) GO TO 115 
NNACCU »JJ)=NTBL( IR) 

116 CONTINUF 

IF (NFLGXX .GT. 0) GO TO 1?7 

WRt Tp ( LTl? ) (NNacC ( 1 , JJ ) » JJ = 1_ , N S P S ) 
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1 71 rOMTlMUf? 

Jl=0 

DO 711 JJ=LWFR,NUPPFR 
J1 =JI+1 

N = N N 4 C C ( 1 >JJ) -(NNArrn + ? 

IPRT( JJ)=NSYM(N) 

I PLOT ( J I )=N5YM(N) 

711 rOMTlNUF 

WR i T.e ( 6 » 1 003 ) T J * ( T PRT ( J J } * J J-LWFR ,NuPPfR ) 

CALL PLT FFA ( I PLOT * I D I F *NFLK , J NCX , I NcY »NSTX »NSTY » 
lNrRF*NFLAGX»MFLAGl ) 

10 03 FO R M A T ( 4X » I 6 . lH* » 1 7 0 A 1 ) 

710 CONT I NUF 

RFWIND LT 1 
NFLAGX=0 
NfLGXX=1 
nflagi =0 
LWFR,?MUPPFR + 1 
LOW=NH I +1 

IF (NUPPFR .LT. NSPS ) GO TO 703 
370 CONTI NUF 

NCCNT a'NCNT-i 
FNO FI LF LT 1 7 
RFWIND LT1P 
RFWIND LT 1 
IXXX=IXXX-4 
I YYY=I YYY-4 
LTi 1=17 
RFTURN 
FNn 
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SUBROUTING- SfQMRg ( NrLUST , KOUNT » I N I TCL ) 

rOMMON /LAB1 /XpaR<43,12) , S I gMa ( 43 , 12 ) ,ROT (43 , 12 , 12 ) 

COMMON /LAR2/X< 12) , ALPHA (49) , NSPS vNSCANS , NfH AN , LT9 , LTlO ,LT 1 1 * LT 12 , 
1LT1 3 »LT1 » 1 XXX » I YYY » 

INSTART *NSTOP, 

1 N r T L G » MOD F» ITYPf , M S F C » 1 4 » N CR F » 

1 N SK I P , I N C X.» I N C Y » N S T X , N S T Y 
nOUPLF P RFC IS I ON A ( 1 2 » 12 ) » FTGFN ( 12 » 12 ) 

DIMFMSION MFRGP ( 2°° ) »MPOP(200) »NFXFC(20) » C ( 43 » 78 ) » R (1 2 * 1 2 ) 
n I M FN S I ON COM ( ?4 ) 

FQU I V A LF NC P ( COM ( 1 ) * NS P S ) 

1000 FORMAT ( IX » 16 » 1 2 F 1 0 * 3 ) 

1001 FORMAT (1X,4HXRAR ) 

1002 FORMAT ( IX »16HDID NOT CONVFRg ) 

1003 FORMAT ( IX *7HICLUST= » I6*14hMfRGF( ICLUST )= ,16) 

1004 FORMAT (1X,5hRhO= , Fl 5 , 7 , 5 hfRR= ,f15.7) 

1005 FORMAT ( 1 X , 1 2 F 1 P # 4 ) 

1006 FORMAT (IX *1216) 

1007 FORMAT nX,23HMFRGTNG WILL TAKF PLACE ) 

1008 FORMAT (TH ) 

1009 FORMAT (13H COV, MATRIX ) 

1010 FORMAT (I2H NORM FIGFN ) 

I On FORMAT ( 1 8H p. A. COV. Matrix ) 

1012 FORMAT ( lH »6HASUM= , Fl 5 . 7 ,7 hfLUSTfR , 1 4 1 

1013 FORMATt IX ,28 hXraR( I ,J ) , J=1 ,12) » I = 1,K0UNT ) 

1014 FORMAT ( IX »29HSIGMa ( T , J ) ,J = 1 ,12 ) , T-l , KOUNT ) 

1015 FORMAT ( IX ,55HROT( I , I CHAN , JCH AN ) , JCHAN = 1 ,12) , ICHAN = 1, 12 ) , 1 = 1 ,KOllNT 
1 ) ) 

1016 FORMAT (IX, 16,(1 2F'l 0 , 3 ) > 

NFLG=0 

CZFCH=FLOaT(NCHAN)-2.0 

RFWIND LT 1 0 

REWIND LT 1 2 

RHO=i .0/ ( 10, 0**5 ) 

if (NSKIP .fq. 0) GO TO 6 

DO 7 1=1 , NSKIP 

CALL SKRrIN(LTiO,1,NOP) 

7 CONTINUF 

6 CONTINUF 

DO 5 I CLUST = 1 »NCLU C T 
MfRGFI ICLUST )=I CLUST 
5 CONTINUE 

im=nchan 

IN= 1 M 
I OPT= 1 

DO 10 ICLUST=If^ITCL,NCLUST 
IF (NFLG ,C-T. 0) GO TO 11 
IF (KOUNT ,GF. 43 ) GO TO fl 
KOI INT =KOUNT + 1 
IFLAGsKOUNT 

CALL FPTCOR ( TFLAG, C* MPOP , NFLG > I N I TOL ) 

WR I Tf ( 6 , 1 00 1 ) 

WR I TF ( 6 , 1 000 ) I FLAG, ( XRAR ( I FLAG, I ) , 1 = 1 , 12 ) 

MI =1 
M J= 1 ? 
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M < =1 ? 

nO 500 MM=1,12 

WR T TF ( 6 » 1 000 ) TFL AG, (C (I FLAG, MR ) ,MR=Ml ,'MJ ) 

MI =M J+l 
MJ=MJ+MK-MM 
500 CONTI NUF 

CALL AMTRX ( IFLAG»XRAR,C>A ,NCHAN) 

WRl TF ( 6,i 0 08 ) 

WR T TF (6,1 0 0 9 ) 

WR r Tf ( 6 * 1 005 ) ( ( A (MT ,MJ) ,MJ = 1 ,12 ) »Mj = l ,1.21 

CALL nJCORI ( A,IM,IN,I0 PT»RhO,FRR,FIGFN) 

WR i Tf ( 6 » 1 004 ) RHO» FRR 

V/R T Tc ( 6 , 1005 ) ( ( a(M T »MJ ) » MJ = 1 , 1 2 ) *Ml = 1 * 1 2 ) 

WR I TF ( 6 , 1 008 ) 

WR T Tf ( 6 » 1 005 ) ( < F TGFN ( Ml , M J ) ,M J = 1. , 12 ) *Ml =1 , 1.2 ) 

IF ( FRR ,fQ. 0.0) r-o TO 15 

MFRGF(ICLUST)=0 

kount=kount-i 
WR i Tf ( 6 » 1 002 ) 

GO TO 10 
15 CONTI NUF 

CALL ROTA ( I FL AG * ROT * F I GEN * NCHAN » A * S I GMA ) 

MFRGEI ICLUST ) =KOUNT 

WR T T r ( 6 * i 009 i i cLUS T *M fRgf ( I r LUST ) 

MPOP ( KOUNT ) =MPOP ( ICLUST) 

IF (KOUNT ,FQ. 1) GO TO 10 
mclust-kount-i 
DO 20 I CHECK = 1 » ?0 
NFXFC ( ICHFCK ) =0 
20 CONTINUE 
MCHECK=1 

DO 25 JCLUST=I ,NCLUST 
I f ( mfRgf ( JCLUST ) . LT .MCHFCK ) GO TO 25 
MCRFr K = MCHFCK + 1 

TF I MERGE (JCLUST) .FQ. KOUNT) GO TO 26 

jflag*mfrgf( JCLUST ) 

DO 30 ICHAN=1 ,NCHAN 

X ( I CHAN ) =XRAR ( JFLAG , I CH AN ) -XR \R ( KOUNT , 1CHAN ) 

SO CONTINUE 

IFLAGsKOUNT 

CALL KCHFCKI TFLAG* ROT,X,SIGMA»ASUM,NchaN) 

WR T T F < 6 * 1 0 0 8 ) 

WR I T f ( 6 * 1 0 1 2 ) a SUM* JfLAG 

IF ( A SUM .gT. FZFCH) GO TO 25 

I FLAG* JFLAG 

CALL KCHFC.KI IFLAG* ROT,X,SIGMA*ASUM,NchaN ) 

WRITF(6»1 008 ) 

WR I TF ( 6 * 1 0 1 ? ) A SUM , J FL AG 
IF ( a SUM , GT. CZECH ) CO TO 25 
NFXFC (1 )=NFXfc( 1)+1 
NSUR=NFXFC ( 1 )+l 
N F X F C ( N SU R ) = JFL a G 

25 CONTINUE 

26 IF (NfXfc (1) .FQ, 0) GO TO 10 
DO 501 KK = 1 * N S U B 
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WRTTF<6»i006 ) KK,NfxFC(KK) 

FOi CONTINUE 

MSUPsNfXFC (1 )+1 
TOT AL=MPOP ( KOUNT ) 

DO 3) I RUN=2 »M$UP 
NSURsNFXFC .( I RUN ) 

FI IMsMPOP { NSUB ) 

TOT AL = TOT AL+SUM 
pi CONTINUE 
I NUMssO 

OFN=MPOP( KOUNT) 

00 PR ICHAN=1 * NTH AN 

X ( I CM AN ) =XRAR ( KOUNT * ICHAN ) *OFN/TOT AL 

00 AO jchan=ichan »nchan 
I NUM=I NUK+1 

R ( ICHAN, JCH AN ) =C (KOUNT , I NUN) *OFN /TOTAL 
40 CON T I Nil F 

PP CONTINUE 

nO 4 C IRUN=?,MSUR 
NSUR=NFXFC < I RUN) 

INUMsO 

OFN=MPOP (NSUR ) 

DO 50 ICHAN = ) ,NCHAN 

X ( T CH AN ) =X ( I THAN ) +X n A R ( NSUR » TCHAN ) *nFN /TOTAL 
nO PR JrHAN=ICHAN»NFHAN 

1 NUM = I NUM+ 1 

R ( I CHAN » JCH AN ) =R ( I OMAN » JCH AN ) +C ( NSUB , I NUM ) *OFN /TOTAL 

5 P CONTINUE 

50 CONTI N U F 

45 CONTI NUF 

00 60 I CHAN = 1 » NCH AN 
nO 65 JCHAN=ICHAN »NcHAM 

A ( I CHAN * JCHAN ) =R ( I THAN > JCH AN ) -X ( I OH AN ) *X ( JCHAN ) 

A ( JCHAN , TCHAN) =A( I<"maN, JCHAN ) 

6 P CONTINUP 
60 CONTlNJr 

WR T TF ( 6 * 1 009 ) 

WR T TF ( 6 * 1 005 ) ( ( A ( M T » M J ) »MJ=1,12) » M I = 1 , 1 2 ) 

CALL DJCORI ( A * IM, IN , IOPT , RHO , FRR » F I GFN ) 

WR I T F ( 6 » ) 0 04 ) RHO » FRR 

WRtTf ( 6 > ) OOP ) ( ( A (M T *MJ ) »MJ=I , ].? ) *Ml = 1 » IP ) 

WR I TF ( 6 * 1 008 ) 

WR T TF ( 6 » 1005 ) ( ( F I GFN (MI ,MJ) ,MJ = 1 ,12) ,.MI=1 » 12 ) 

IF (fRR ,NF. 0.0) GO TO 10 
WR I T F ( 6 » 1 007) 

IFLAG=NFXFC(?) 

MPOP ( I FL AG ) =TOT AL 
INUMSO 

nO 70 ICHAN = i ,NCHAN 
XRAR( I FLAG* ICHAN)=X( IFHAN) 

00 75 JCHAM = tcman *mfhan 
I NUM= INUM+ l 

C ( IFLAG, INUM)=B< ICHAN.JCHAN) 

75 CONTINUE 
70 CONTI NUF 


76 



CALL ROTa MFLAG>R0T»FTGFN,N''HAN,A,SIGMA) 
00 RO JCLUST = 1 *NCLUST 

00 8 c irijm = ?,mrub 

N$UB=NFXFr{ I RUN ) 

IF ( MFRGF ( JrLUST ) .NF. NSU« ) GO TO 85 
MERGE < JCLUST ) = I FLAG 
85 CONTINUE 
80 CONTINUE 

M FRGF ( ICLUST ) = TEL AG 
I F ( NFXFC ( 1 l.EQ.I ) GO TO 94 

1 8'JaO 
JCWFOKrl 

00 90 JCLUST = 1 »NcLUST 
IOUMsMFRGEf JCLUST) 

91 if( MFRGF (JCLUST >. LT.JCHFCKICO TO 90 
IFlMERGFi JCLUST) .GT.JCHFCKIGO TO 92 
IF (I SW.FQ . 1 )CvO TO oq 

JCMECK= JrHFrK+I 

IF ( Jr H rri< .FO.KOUMT ) ro TO 94 

GO TO 90 

92 MfRGf ( JCLUST ) =MFRGF ( J cLUST ) -1 
ISW=1 

GO TO 91 

93 IF ( JCHF^K .gT, KOI INT ) GO TC 94 

1 c i,i - n 

I Nl lM-0 

MP.O.P ( JCHFCK ) = MPOP ( T DUM ) 
nn 98 ICHAN=1 jNCHAN 

XBAR ( JCHECK * I CHAN ) =XB AR ( I DUM » I CHAN ) 

SIGMA! JCHECK* I CHAN) =SIGMA< I DUM, I CHAN) 

DO 100 JCH AN = I CHAN »NCH AN 
I NUMr INUM+1 


r{ JCwfcK , lNUM)=c ( ToUM,tN!JM) 

ROT ( JCHFCK * I CHAN * JCHAN ) =ROT ( IDJJM* I CHAN* JCH AN ) 
ROT ( JCHFCK » JCHAN » I ''HAN ) =ROT ( T DUM* JCH AN , I CHAN ) 
100 CONTINUE 
95 CONTINUE 

DO 96 LCLUST-I ,NCLUST 
I F( MfRGF ( LCLUS T ) . Nf . I DUM ) GO TO 96 
MFRGF (LCLUST ) = JCHFr K 
06 CONTINUE 

JCHFCK= JGHECK+1 


90 

10 
1 1 


CONTINUE 

K0UNT=K0UNT-NFXFC(1 

CONTINUE 

CONTINUE 

W.R I TF ( LT 9 ) (COM(T), 
WR T Tf ( LT 9 ) ( ( X raR ( I 

WR T Tf ( LTo ) ( ( S TGMa ( 

WR T Tf ( LT9 ) ( ( ( ROT ( r 


) 


1=1*24) 

* J ) * 1 = 1 * KOUNT ) , J = 1 ,1 2 ) 

T » J) * 1 = 1 * KOUNT ) , J=i , 12) 
* I CHAN * J<~H4N ) » 1 = 1 , KOUNT 


1 JCHAN= 1 *NCHAN ) 


T ehaN = 1 ,NchaN ) , 


WR I T f ( 6 , i 0 1 3 ) 

DO 510 1 = 1 » KOUNT 

WR T Tf ( 6 * 1 000 ) t*(XraR(I>J)*J= 1 > 12 ) 
510 CONTINUE 
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WR I TF ( 6 »1 014) 

DO 5H 1=1 » KOUNT 

WRtTf ( 6 » 1000 ) i , < SIGMa ( I ,J ) , J = 1 , 12 ) 

511 CONTINUE 

WR I T F,( 6 » 1 0-15 ) 

DO 512 1=1* KOUNT 

WR/T Tf(6»1016) I >< (ROT ( I .ICHAN .JCHAN) , JCHaN= 1 * 12 ) » ICH AN = 1 * 12 ) 

512 CONTINUE 

DO 515 I=1»NCLUST 

IF (MfRGF t I ) .GT.KOUNT jr-0 TO 514 

WR T Tf ( 6 » 51 5 H ,MFRGF ( I ) 

5 1 5 FORMAT ( IX * 7HCLUSTFR , 1 4 , 1 X , 5hcLaSS * T 4 ) 

515 CONTINUE 
514 CONTINUE 

DO 660 1=1* KOUNT 
DO 620- ICHAN=1 ,NCHAN 
DO 610 JCHAN = 1 ,NCHAN 

B ( I CHAN * JCHAN ) =R0T ( I ,JCH4N *ICHAN ) /SIGMA ( I , ICHAN) 

610 CONTINUE 
6?0 CONTINUE 

DO 650 ICHAN=1 >NCHAN 
DO 640 KCHAN- 1 * NCHAN 

5 HM=n ,0 

DO 650 JCHAN = 1 , NCHAN 

SUM=SUM+ROT ( T *1 CHAN » JCHAN ) *D ( JCH AN , KCHAN) 

650 CONTINUE 

A ( ICHAN * KCHAN ) = SUM 
640 CONTINUE 
650 CONTINUE 

WR I Tf (6*600) I 

600 FORMAT ( IX * 15HCLASS ELLIPSE* I 4) 

WR T Tf ( 6 » 1005 ) ( ( A ( T A » J A ) »Ja=1 » NCHAN ) * I A = 1 hVhaN ) 

660 CONTINUE 

REWIND L T9 

RETURN 

END 
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SURROUTiNf FFTrOR ( tfLAG*C >NPOP ,NfLG ,N ) 

COMMON /LAR1/XraR(43»12) * SI GMA (43*12) ,R0t ( 43 , 12 , 12 )’ 

COMMON /LAR2/XH2) , aLPhA ( 49 ) ,NSPS .NSCANS ,NchaN ,LT 9, LT10 ,LTl 1 » LT12 , 
1LT13»LT1»IX»IY* 

1NSTART, NSTOP, 

1 NrTLG »MOdf * iTYPf»MSFC»I4*NcRF* 

1 NSK I P > I NrX * TNcY.NSTX jN-STY 
DIMENSION NPOP M ) *0(43*78 ) ,NPAT { 233 ) 

DATA NC'NT/0/ 

INUMsO 
N.FLG1 =0 
NFLGRsO 

DO 5 I CHAN=1 * NcHAN 
XraR( I FLAG* ICHAN) = 0.0 
DO 10 J C H A N = T C H A N » N CH A N 
I NUM* I NUM+ 1 
C ( I FLAG » I NUM ) =0, 0 
10 CONTINUE 
5 CONTINUE 
KN T = 0 

40 CONTINUE 

IF (NCNT .OF, NSCANS) GO TO 70 
NFL G 7=0 

RfaD(LT12 ) (NdaT ( J J 1 ,JJ=1 *NSPS) 

NcNT=NCNT+1 

NFLGI=1 

nflag?=i 

DO 20 J J=1 »N$P$ 

CALL CtFT ( XII ) *NSP S * 0 , NCHAN * N Sr a NO » L T 1.0 * tfRR»NfLAG2 » 

INSTART » Np T LG » MOD f * A|cR F * ITYPf,MSFC ) 

IF ( NhaT ( JJ ) ,NF, N) GO TO pO 

KNT=KNT+1 

NFLG?=1 

AIsFL0AT(KNTi 

INUM=0 

DO 73 ICHAN = 1 ,NCHA:N 

XRAR ( I FLAG* I THAN ) = { 1.0-1 .0/ At )*XRAR ( I FLAG * I CHAN ) +X ( TCHaN ) /A I 
DO 26 JCHAN=ICHAN*NfhaN 
INIjM = iNUM+1 

r < T FLAG* I NUM ) = ( 1 .0-1, 0/A I )*0 { T FLAG* T NUM ) +X( ICHA.N J*X ( JghAN ) /A T 
26 CONTINUE 
25 CON TINUF 

GO TO 70 
30 CONTINUE 

IF (NFLGG .FQ, 1) GO TO ?0 
IF (NDAT(JJ) . NF. N + ] ) GOTO 20 
NS AV=NCNT 
NFLG3=1 
20 CONTINUE 

C WR I Tf ( 6 » 1000 ) NcNT ,KNT jNSA V»Nrc^uP *.N ,NfLg2 »NFLG3 

1000 FORMAT (IX* 718) 

IF (NFLG2 .NF. 0) GO TO 40 
IF (NFLG3 .FQ. 0) GO TO 40 

nrckup=ncnt-nsav+i 

CALL RSRFCD(LTlO,NnrKUP*I4 »RF) 
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S UR ROUT I NF aMTRX ( I FLAG jXRAR , f , A , NrHAN 1 
ni VPNS ION XraR { 4S » 1 ? ) »f ( 4S ,pf3 ) 
nOlJPLR PRECISION A(i?>i?) 

I N> 'M s 0 

RO 1 I tha N = 1 , Nfh AN 
HO ? JCHAN=ICHAN , Nfh AN 
I NUN= I NUM +1 

A ( I CHAN * JCHAN ) =C ( I FLAG » I NUM ) -XRAR ( I FL AG » I CHAN ) *XBAR ( I FL AG » .JCHAN ) 
A ( J C H AN » T C H A N J = A ( I r H A N » J F H A N ) 

? CONTI NUF 

1 CONT I NIJF 

RFT'lRN 
R N n 


Si ip ROUT I NF ROT A { I FL Afi »R0T » F I GRN ,NCHA.N , A , S T GMa t 

OTHFNSION ROT ( 4 R » 1 ? , 1 ? ) 

n r MRN A I ON S T G M A ( 4 , 1 ? ) 

nO'JRL R P R R r T G T ON A ( ’ ? ♦ 1 F ) 

o n y R L R P P R F T S t O N FircNf i ? , 1 ? ) 

no 1 ICHAN=1 * NFH AN 

SIGMA ( IFLA0,iCHAN)=A( ICHAN.ICHAN) 

no 2 JCHAN=1 »Nfh am 

ROT ( I FL AF * I CHAN * JCH AN ) = F I f-fn ( JCHAN * I CHAN ) 

? FONT I NUF 

i FONT I Nl.iR 

RF TURN. 

FNn 


SiJRROUTINR KfhrfK ( TFLAG* ROT>X »STGMa,A.SUM,NfhaN) 

ntWFNSlON ROT <43* 12 .12 ) » SIGMA (40*12) , X( 1 1 
A si <M = n, n 

no S T CH A N= 1 » Nch aN 

S' iM = n, o 

no 4 JCH A N = 1 » NCH a N 

SUM = SUM+ROT ( I FLAG , T fraN . JCH AN I #X ( JfhAM ) 

4 CONTINUE 

A S U M = A S U M + S U M * S U M / S T G M A ( I FL AG, IFHAN ) 
s CONTI NUF 

rrturn 

RAJ n 
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SUBROUTINE CLASS (Nr LASS , NT FST »NP ASS ) 

COMMON /LAB 1 /XraR ( 43 » 12 ) » S I gMa ( 43*12) ,ROr ( 43 » 12 , 12 ) 

COMMON /LAR2/X (12) , ALPHA ( 49 ) *NSPS *NSC ANS *NchAN ,LT9 , LTlOvLTl 1 * LT12 > 
1.LT1 3 »LT 1 » I X »NDUMMY » 

1 NSTART » NS TOP , 

1 NrTLg *MOdf » I TYPF »MSFC, 1 4, NCR f, 

1 N SK I P , I NC X , I NC Y , N ST X * NST Y 
OTMFNSION W(12 ) »mTap (3) 

OTMFMSION NDAT ( 255 *"* ) ,PRNT ( 755 ) 

DIMENSION COM ( 24 ) 

FQIJIVALFNCP (COM ( 1 ) , NSPS ) 
rfwind ltd 

RFWIND LT1 0 
REWIND LT 1 
REWIND LT 1 ? 

RFWTND LT 1 3 
CZ FCH-NCHAN 

IF ( NS< I P . C Q . 0) so TO 601 

DO 60? 1=1. *NSK lP 

CALL SKRS I N ( L T | n , 1 » NOP ) 

60? CONTINUE 
601 CONTINUF 

Rfad(LTP) ( rOM (M*T = 1,?4) 

RcaD( LT9 ) ( (XR AR ( t ,J) , T = 1 *Nrl. ASS) * J = 1 *12 > 

Rfad ( LT9 1 ( ( SK-Ma ( t , J) , i =1 , NcL A S.S 1 » J=1 » 12) 

READ ( LT9 ) ( ( (ROT ( j , r CM AN, Jew AN) , 1=1 , NcL ASS) , | ChaN= 1 , No-iaN ) , 

1 JCWAN = 1. * New an ) 

DO 1 1 = 1 ,3 

) M T A R ( I) = 1 

DO 10 I FND-1 , NSC ANS 

Rfad ( LT l? ) (NdaT ( T , 1 ) * I = 1 , NSPS ) 

NFLAG?=1 

DO ?0 ISURN = 1 ,NSPS 

call G F T ( X ( 1 ) , N S P S , r , N C H A N , N $ C A N 0 , L T 1 0 , 1 F R R , N F L a G 2 » 

INSTaRT »NpTLG»MCDF,(s|rRF* T T Y P F * M S F C ) 

IF (NDAT ( IS'J(»N,1 ) .ST. 0 ) Nd*T < i SU^N * 1 ) =0 
SMALL*! .76*CZFCW 
DO ?S I CLASS=1 , NCL ACS 
DO SO ICH.AN = 1 ,NCHAN 

W( I CHAN ) =X ( ICHAN)-XRAR( ICLASS » I CHAN) 

SO CONTINUF 
AS'iMsO.O 

DO 35 ICHAN=1 *NCHAN 
.SU'M= 0. 0 

DO 40 JCHAN = 1. >NCHAN 

SUM=SUM+W( JchaN)*ROT( r CLASS, ICHA/M , JCHAN) 

40 CONTINUF 

41 AS U M = A S U M + S U M * S U M / S T G M A ( I CLASS * ICMAN ) 

36 CONTINUF 

IF ( ASUM ,GT. SMALL) GO TO ? 5 
SMALL* A SUM 

NDATC ISU(?N,1 ) = I CL AS S 
?S CONTINUF 
1016 FORMAT (IX *316 ,F10. ? ) 

?0 CONTINUF 
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WRlTFlLTl ) ( Nd AT ( I *1 ) *1=1 *NSPS ) 

10 CONTINUF 

END FILF LTl 
RFWIND LTl 
RFWIND LTl? 

RFWIND LTl 0 
DO 610 I Z = 1 » N S C A N S 
I Y=MT AB ( 1 ) 

RFAD(LTl) (NDAT( TA»IYi »TA=1»NSPS) 

ntfmp=mt ab ( 1 ) 

MTAB(1)=MTab(2) 

MTAR(3)=MTap(3) 

MT AB ( 3 1 =NTEMP 

IF ( IZ ,LT . 3) GO TO 610 

DO 630 IA=NSTART*NSTOP 

IF ( I A .FQ. 1 1 GO TO 630 

IF ( I A+l .GT. NSTOP 1 GO TO 630 

T IYaMTAR ( 3 ) 

IF (NPASS .pQ. NTFST) GO TO 631 
IF ( NDAT ( I A ♦ I I Y } .LT. 0) GO TO 620 
621 CONTINUE 

I M = M T A B ( 1 ) 

I N=MT AB ( 2 ) 

M=ND AT ( I A , I M ) 

N = N D A T ( I A— 1 , IN ) 

IF (M .NF. N) GO Tn 630 
I LaMTAB ( 3 ) 

L=NDAT(IA»IL) 

IF ( M .NF. L) GO TO 630 
NDAT (I A» 1 1 Y)=M 
GO TO 620 
630 IMaMTABM ) 

MsNDAT ( I A- l » IM ) 

I N=MT AB ( 3 ) 

N=ND AT ( IA-1 * IN) 

IF (M .NF. N) GO TO 620 
L=NDAT( I A+l » I M ) 

IF (M .NF. L) GO TO 620 
NhaT ( I A » 1 1 Y ) =M 
630 CONTINUF 

IF ( IZ .LT. 3 ) GO TO 610 
LsMTABIl ) 

WR T Tf ( LT 1 3 ) (NDATIt ,L) ,1=1 .NSPS) 

610 CONTINUE 

DO 611 1=2,3 

L=MT AB ( IU 

WR I Tp (LTl 3 1 (NDAT(TL.,L),IL = 1 ,NSPS) 

611 CONTINUF 
RFWIND LTl 
RFWIND LTD 
RFWIND LT 1 3 
RFWIND LTlO 
LOW=NSTART 
LWFR=1 

BOO CONTINUF 
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NHI =L0W+1 ?0-i 
NUPPFR=LWFR+ i ?0 -i 

IF (NUPPFR .rT. NSPS ) nuppfr=nsps 

iniF=NUPPFR-LWFR+I 

WRITE (6, 1007) 

1007 FORMAT ( 1 HI ) 

CALL LARFLA(LOW»NHI ,1 ) 

DO 801 T 1=1 »NSCANS 

Rfah(LTT7 ) (NDAT( JJ,1 1 »JJ=1 * NSPS ) 

DO 808 JJ=LWFR. NUPPFR 
IP=NdAT(JJ»1 1 
IRND=IB- ( IB-1 ) /45*4S+2 
PRNTI JJ)=ALPHA(IRND) 

808 CONTINUE 

CALL PLTBFAfPRNT(LWFR) * I D I F * NrLK , I N cX » I NCY , N8TX , NST Y , NcRE » 
iNFLAGX.NFLAGl) 

WR?Tf(6»1 008) II* (PRNTI JU) ,JJ=LWFR, NUPPFR) 

1008 FORMaT(4X*!6,1h**120A1> 

80) CONTINUF 

REWIND LT 1 8 

NFLAG1=0 

NFLAGX=0 

LWFR=NUPPFR+i 

LOW=NHI+l 

IF (NUPPFR ,LT, NSPS ) GO TO 800 
802 CONTINUF 

NSCANS=NSCANS-7 

return 

END 
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